Learning objectives
In this lesson, we will start the process of simulating an election from polling data by learning how to read this polling data from a file. We will apply our work with parsing to two tasks:
- Reading a file containing electoral college votes into a map of strings to (unsigned) integers that associates each state to its number of electoral votes.
- Reading a file containing polling data into a map of strings to decimals that associates each state with a candidate’s polling percentage in that state.
Although these tasks are specific, once we can read data from files, we will obtain a vital transferable skill that will give us confidence to build larger projects involving larger files.
Setup
We are providing starter code and data in the form of a folder election.zip. Download this file, expand its contents into a folder election, and move this folder into your go/src source code directory.
The election directory contains the following two Go files and represents our first example of having multiple such files in the same directory. As our programs grow, we will divide our code into multiple files, each devoted to a different task.
- a main.go file, where we will place our code for running the election simulator in the next code along;
- an io.go file, where we will place code in this code along for parsing data from files.
It also contains a data directory, which contains useful data for simulating the 2016 US presidential election in the form of four files. The electoralVotes.csv file contains electoral votes for all states; each line contains a state name, followed by a comma, followed by its number of electoral votes. The remaining three text files contain polling data; each line of a file contains a state name, followed by the polling percentage for candidate 1 (Clinton), followed by the polling percentage for candidate 2 (Trump), with the values separated by commas. The polling data in these files was sampled at different times before the 2016 election as described below.
- earlyPolls.csv: contains polls from summer 2016.
- conventions.csv: contains polls from around the Republican and Democratic National Conventions in mid- and late July 2016.
- debates.csv: contains polls from around the presidential debates, in late September through mid-October 2016.
We encourage you to explore these files before starting the code along, but don’t change anything!
Code along video
Beneath the video, we provide a detailed summary of the topics and code covered in the code along.
Although we strongly suggest completing the code along on your own, you can find completed code from the code along in our course code repository.
Code along summary
Reading Electoral College votes from file
To build our election simulator, we will need two functions for reading data from a file that we will place in io.go. We focus first on reading the data in electoralVotes.csv, which has the form shown in the figure below.
Our function for parsing the electoral vote data, ReadElectoralVotes(), takes as input a string filename. It returns a map of strings to unsigned integers that we have been calling electoralVotes and that maps a state’s name to its number of Electoral College votes. We first create this map, which we will eventually return.
// ReadElectoralVotes takes in a filename from which to read electoral votes.
// It returns a map associating the name of each state to the state's number of Electoral College Votes.
func ReadElectoralVotes(filename string) map[string]uint {
    electoralVotes := make(map[string]uint)

    // to fill in

    return electoralVotes
}
Next, we read in the file using the function os.ReadFile(). This function belongs to the "os" package imported at the top of io.go, and much like the io.ReadAll() function that we introduced in the previous chapter, it returns a slice of byte variables corresponding to every symbol present in the file, along with an error variable. We denote these variables as fileContents and err, respectively.
As previously, if err has any value other than nil, we need to report the error, which we will do using panic(err). We have written the code
if err != nil { panic(err) }
quite frequently in our work thus far in the course, which lets us apply the coding principle that duplicated code should be moved into a subroutine. In this case, we place this code into a Check() subroutine.
// ReadElectoralVotes takes in a filename from which to read electoral votes.
// It returns a map associating the name of each state to the state's number of Electoral College Votes.
func ReadElectoralVotes(filename string) map[string]uint {
    electoralVotes := make(map[string]uint)

    // read file contents
    fileContents, err := os.ReadFile(filename)
    Check(err)

    // to fill in

    return electoralVotes
}

// Check takes as input a variable of error type.
// If the variable has any value other than nil, it panics.
func Check(e error) {
    if e != nil {
        panic(e)
    }
}
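To see the two return values of os.ReadFile() in isolation, the following minimal sketch writes a small sample file to a temporary directory and reads it back. The helper name writeAndRead, the file name sample.csv, and the sample lines are assumptions made for illustration; none of them come from the starter code.

```go
package main

import (
    "fmt"
    "os"
    "path/filepath"
)

// writeAndRead is a hypothetical helper: it writes contents to a temporary
// file, reads the file back with os.ReadFile, and returns what it read.
func writeAndRead(contents string) (string, error) {
    dir, err := os.MkdirTemp("", "election-demo")
    if err != nil {
        return "", err
    }
    defer os.RemoveAll(dir) // clean up the temporary directory

    path := filepath.Join(dir, "sample.csv")
    if err := os.WriteFile(path, []byte(contents), 0644); err != nil {
        return "", err
    }

    // os.ReadFile returns every byte of the file plus an error value.
    fileContents, err := os.ReadFile(path)
    if err != nil {
        return "", err
    }
    return string(fileContents), nil
}

func main() {
    text, err := writeAndRead("Ohio,18\nTexas,38\n")
    if err != nil {
        panic(err)
    }
    fmt.Printf("%q\n", text)

    // Reading a file that does not exist produces a non-nil error.
    _, err = os.ReadFile(filepath.Join(os.TempDir(), "no-such-election-file.csv"))
    fmt.Println(err != nil)
}
```

The second call shows the situation that Check() is meant to catch: when the file cannot be read, err is not nil.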
We would like to convert the data into a slice in which each line of the file corresponds to a single element of the slice. First, as we learned in the previous chapter, we can convert the fileContents slice of byte variables into a single string giantString that concatenates all these symbols together using string(fileContents). We then apply the strings.Split() function from the "strings" package, which splits giantString into a slice of strings, ending the current string every time we encounter a given separator symbol. In this case, that symbol is the newline character, denoted "\n", so that the slice of strings obtained after calling strings.Split() contains a single string for each line in the file.
// ReadElectoralVotes takes in a filename from which to read electoral votes.
// It returns a map associating the name of each state to the state's number of Electoral College Votes.
func ReadElectoralVotes(filename string) map[string]uint {
    electoralVotes := make(map[string]uint)

    // read file contents
    fileContents, err := os.ReadFile(filename)
    Check(err)

    giantString := string(fileContents)
    lines := strings.Split(giantString, "\n")

    // to fill in

    return electoralVotes
}

// Check takes as input a variable of error type.
// If the variable has any value other than nil, it panics.
func Check(e error) {
    if e != nil {
        panic(e)
    }
}
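The behavior of strings.Split() on the newline separator can be checked on a short string. In the sketch below, the helper name splitLines and the two sample lines are assumptions for illustration. It also shows one detail worth knowing: if the text ends with a newline, the resulting slice has an extra empty string at the end.

```go
package main

import (
    "fmt"
    "strings"
)

// splitLines is a hypothetical wrapper around the same call used in the lesson.
func splitLines(giantString string) []string {
    return strings.Split(giantString, "\n")
}

func main() {
    withoutTrailing := splitLines("Ohio,18\nTexas,38")
    withTrailing := splitLines("Ohio,18\nTexas,38\n")

    fmt.Printf("%d %q\n", len(withoutTrailing), withoutTrailing) // 2 ["Ohio,18" "Texas,38"]
    fmt.Printf("%d %q\n", len(withTrailing), withTrailing)       // 3 ["Ohio,18" "Texas,38" ""]
}
```

Whether this matters for the course data depends on whether the provided files end with a newline, which is worth checking when exploring them.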
Now that we have separated all the lines in the file, we should range over these lines and parse each one, which contains a state name, followed by a comma, followed by the number of electoral votes for that state. Here, we can again use strings.Split() to divide each line of the file when we encounter a comma into a slice containing two strings corresponding to that line’s entries, which we call lineElements.
// ReadElectoralVotes takes in a filename from which to read electoral votes.
// It returns a map associating the name of each state to the state's number of Electoral College Votes.
func ReadElectoralVotes(filename string) map[string]uint {
    electoralVotes := make(map[string]uint)

    // read file contents
    fileContents, err := os.ReadFile(filename)
    Check(err)

    giantString := string(fileContents)
    lines := strings.Split(giantString, "\n")

    // now, parse out data on each line and add to electoralVotes map
    for _, currentLine := range lines {
        lineElements := strings.Split(currentLine, ",")
        // to fill in
    }

    return electoralVotes
}

// Check takes as input a variable of error type.
// If the variable has any value other than nil, it panics.
func Check(e error) {
    if e != nil {
        panic(e)
    }
}
We are now able to access each of the two strings that we have parsed. The first, lineElements[0], is a string containing a state name, which is what we want; the second, lineElements[1], is a string representing the number of electoral votes, and we need to use our old friend strconv.Atoi() to convert it to an integer votes.
// ReadElectoralVotes takes in a filename from which to read electoral votes.
// It returns a map associating the name of each state to the state's number of Electoral College Votes.
func ReadElectoralVotes(filename string) map[string]uint {
    electoralVotes := make(map[string]uint)

    // read file contents
    fileContents, err := os.ReadFile(filename)
    Check(err)

    giantString := string(fileContents)
    lines := strings.Split(giantString, "\n")

    // now, parse out data on each line and add to electoralVotes map
    for _, currentLine := range lines {
        lineElements := strings.Split(currentLine, ",")
        stateName := lineElements[0]
        votes, err := strconv.Atoi(lineElements[1])
        // to fill in
    }

    return electoralVotes
}

// Check takes as input a variable of error type.
// If the variable has any value other than nil, it panics.
func Check(e error) {
    if e != nil {
        panic(e)
    }
}
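As a small illustration of why the error returned by strconv.Atoi() deserves attention, the sketch below tries a few sample fields. The helper name atoiOK and the sample inputs are assumptions for illustration. Note that strconv.Atoi() does not trim whitespace, so a stray space or a carriage return left over from Windows line endings makes the conversion fail.

```go
package main

import (
    "fmt"
    "strconv"
)

// atoiOK is a hypothetical helper that reports the converted value and
// whether strconv.Atoi accepted the field.
func atoiOK(field string) (int, bool) {
    votes, err := strconv.Atoi(field)
    if err != nil {
        return 0, false
    }
    return votes, true
}

func main() {
    for _, field := range []string{"18", "eighteen", "18\r", " 18"} {
        votes, ok := atoiOK(field)
        fmt.Printf("%q -> %d, accepted: %t\n", field, votes, ok)
    }
}
```

Only the first field, "18", is accepted; the other three produce an error.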
After applying our Check() subroutine again to ensure that strconv.Atoi() did not introduce an error, all that remains is to convert votes to an unsigned integer and assign it to electoralVotes[stateName].
// ReadElectoralVotes takes in a filename from which to read electoral votes.
// It returns a map associating the name of each state to the state's number of Electoral College Votes.
func ReadElectoralVotes(filename string) map[string]uint {
    electoralVotes := make(map[string]uint)

    // read file contents
    fileContents, err := os.ReadFile(filename)
    Check(err)

    giantString := string(fileContents)
    lines := strings.Split(giantString, "\n")

    // now, parse out data on each line and add to electoralVotes map
    for _, currentLine := range lines {
        lineElements := strings.Split(currentLine, ",")
        stateName := lineElements[0]
        votes, err := strconv.Atoi(lineElements[1])
        Check(err)
        electoralVotes[stateName] = uint(votes)
    }

    return electoralVotes
}

// Check takes as input a variable of error type.
// If the variable has any value other than nil, it panics.
func Check(e error) {
    if e != nil {
        panic(e)
    }
}
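The function above assumes that every line holds exactly two comma-separated fields and a valid integer. The following is a hedged sketch of a more defensive variant, not the course's implementation. It works on a string rather than a file so that it can be run without the data folder, skips blank lines, trims stray whitespace, and returns an error instead of panicking. The name parseElectoralVotes is hypothetical, and the sketch swaps strconv.Atoi() for strconv.ParseUint(), which rejects negative numbers.

```go
package main

import (
    "fmt"
    "strconv"
    "strings"
)

// parseElectoralVotes is a hypothetical, more defensive parser for text
// in which each line has the form "state,votes".
func parseElectoralVotes(text string) (map[string]uint, error) {
    electoralVotes := make(map[string]uint)

    for i, currentLine := range strings.Split(text, "\n") {
        // Remove surrounding whitespace, including any "\r" from Windows files.
        currentLine = strings.TrimSpace(currentLine)
        if currentLine == "" {
            continue // skip blank lines, such as one after a trailing newline
        }

        lineElements := strings.Split(currentLine, ",")
        if len(lineElements) != 2 {
            return nil, fmt.Errorf("line %d: expected 2 fields, got %d", i+1, len(lineElements))
        }

        stateName := strings.TrimSpace(lineElements[0])
        // Bit size 0 asks ParseUint for a value that fits in a uint.
        votes, err := strconv.ParseUint(strings.TrimSpace(lineElements[1]), 10, 0)
        if err != nil {
            return nil, fmt.Errorf("line %d: %w", i+1, err)
        }
        electoralVotes[stateName] = uint(votes)
    }

    return electoralVotes, nil
}

func main() {
    votes, err := parseElectoralVotes("Ohio,18\r\nTexas,38\n\n")
    if err != nil {
        panic(err)
    }
    fmt.Println(len(votes), votes["Ohio"], votes["Texas"]) // 2 18 38

    _, err = parseElectoralVotes("Ohio,-3")
    fmt.Println(err != nil) // true
}
```

Returning an error leaves the decision about how to react to the caller, whereas the Check() approach in the lesson stops the program immediately; for a small simulator either choice is reasonable.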
STOP: Now that we have written these functions, let’s ensure that they compile. Open a terminal application, navigate into the go/src/election directory, and compile your code using the command go build. (We don’t yet have any code to run.)
Reading polling data from file
The second function we will write, ReadPollingData(), also takes as input a string filename. It returns a map of strings to decimals that we have been calling polls, and that maps a state’s name to the current polling percentage for candidate 1.
Note: Because we are assuming only a two-candidate race, we can access the polling percentage for candidate 2 by subtracting candidate 1’s polling percentage from 1.
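The arithmetic in this note can be written as a one-line function. This is only a sketch under the two-candidate assumption stated above; the function name candidate2Share and the sample value 0.47 are made up for illustration.

```go
package main

import "fmt"

// candidate2Share returns candidate 2's share of the vote, assuming a
// two-candidate race in which the two shares sum to 1.
func candidate2Share(candidate1Share float64) float64 {
    return 1.0 - candidate1Share
}

func main() {
    fmt.Printf("%.2f\n", candidate2Share(0.47)) // 0.53
}
```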
Most of this function proceeds similarly to ReadElectoralVotes(); we read the file, convert its contents to a string, and then split that string into a slice of strings lines.
// ReadPollingData takes in a filename from which to read data.
// It returns a map of state names to polling percentage for candidate 1.
func ReadPollingData(filename string) map[string]float64 {
    candidate1Percentages := make(map[string]float64)

    fileContents, err := os.ReadFile(filename)
    Check(err)

    giantString := string(fileContents)
    lines := strings.Split(giantString, "\n")

    // to fill in

    return candidate1Percentages
}
The difference arises because each line in the polling data files has three elements instead of two (see figure below), corresponding to the state name and the polling percentages for both candidates. However, as mentioned previously, we only need to read in the percentage for candidate 1, so the only difference with ReadElectoralVotes() is that we need to parse the second value in each row as a decimal; we then divide this value, represented as a percentage, by 100.
// ReadPollingData takes in a filename from which to read data.
// It returns a map of state names to polling percentage for candidate 1.
func ReadPollingData(filename string) map[string]float64 {
    candidate1Percentages := make(map[string]float64)

    fileContents, err := os.ReadFile(filename)
    Check(err)

    giantString := string(fileContents)
    lines := strings.Split(giantString, "\n")

    // now, parse each line and add data to map
    for _, currentLine := range lines {
        lineElements := strings.Split(currentLine, ",")
        stateName := lineElements[0]
        percentage1, err := strconv.ParseFloat(lineElements[1], 64)
        Check(err)
        // normalize percentages and enter map value
        candidate1Percentages[stateName] = percentage1 / 100.0
    }

    return candidate1Percentages
}
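The per-line work in ReadPollingData() can be tried on a single string. In the sketch below, the helper name parsePollLine and the sample line "Ohio,46.5,43.5" are assumptions for illustration and are not taken from the course data. The third field is present in the line but deliberately ignored, as in the lesson.

```go
package main

import (
    "fmt"
    "strconv"
    "strings"
)

// parsePollLine is a hypothetical helper that parses one line of the form
// "state,percentage1,percentage2" and returns the state name together with
// candidate 1's percentage converted to a fraction between 0 and 1.
func parsePollLine(currentLine string) (string, float64, error) {
    lineElements := strings.Split(currentLine, ",")
    if len(lineElements) < 2 {
        return "", 0, fmt.Errorf("expected at least 2 fields, got %d", len(lineElements))
    }

    // The second argument, 64, asks for float64 precision.
    percentage1, err := strconv.ParseFloat(lineElements[1], 64)
    if err != nil {
        return "", 0, err
    }
    return lineElements[0], percentage1 / 100.0, nil
}

func main() {
    stateName, share, err := parsePollLine("Ohio,46.5,43.5")
    if err != nil {
        panic(err)
    }
    fmt.Println(stateName, share) // Ohio 0.465
}
```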
STOP: Once again, check that our code compiles by executing the terminal command go build.
Ensuring that our code works, and looking ahead
In the next code along, we will write code in main.go that, after reading in the data using the functions that we have written, will run a Monte Carlo simulation of multiple elections using this polling data.
For now, we will ensure that our code compiles and is behaving correctly. In main.go, we will declare two strings to store the file names of our electoral votes and polling data. Because main.go resides in the election folder, and we need to access the data subdirectory, we will need to add the prefix data/ before our desired file name.
After reading the data into the electoralVotes and polls maps, we will print these maps to confirm that the file parsing code is working.
package main

import (
    "fmt"
)

func main() {
    fmt.Println("Simulating the 2016 US Presidential election.")

    electoralVoteFile := "data/electoralVotes.csv"
    pollFile := "data/earlyPolls.csv"

    electoralVotes := ReadElectoralVotes(electoralVoteFile)
    polls := ReadPollingData(pollFile)

    fmt.Println(electoralVotes)
    fmt.Println(polls)
}
We can now compile all of the code present in the election folder using the single terminal command go build. After doing so, we will run our code by executing ./election (Mac) or election.exe (Windows).
The result of printing the maps is not pretty, but it tells us that the code is working. Delete the two print statements, and you will be ready for our next code along!
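If a more readable check is wanted before deleting the print statements, one option is to print one state per line and sum the votes. This is a sketch only: the helper names sortedStates and totalVotes are hypothetical, and the three-state map stands in for the full map returned by ReadElectoralVotes().

```go
package main

import (
    "fmt"
    "sort"
)

// sortedStates returns the keys of the map in alphabetical order.
func sortedStates(electoralVotes map[string]uint) []string {
    states := make([]string, 0, len(electoralVotes))
    for state := range electoralVotes {
        states = append(states, state)
    }
    sort.Strings(states)
    return states
}

// totalVotes sums the electoral votes over all states in the map.
func totalVotes(electoralVotes map[string]uint) uint {
    var total uint
    for _, votes := range electoralVotes {
        total += votes
    }
    return total
}

func main() {
    // Sample map used only for this illustration.
    electoralVotes := map[string]uint{"Texas": 38, "Ohio": 18, "Florida": 29}

    for _, state := range sortedStates(electoralVotes) {
        fmt.Printf("%-10s %d\n", state, electoralVotes[state])
    }
    fmt.Println("total:", totalVotes(electoralVotes)) // total: 85
}
```

Comparing the total against the expected number of electoral votes in the data file is a quick way to spot a line that was skipped or misread.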