Data Files

All data files in the book are simulated, except for two public website logs in Chapter 9, because few industry data sets are publicly available. No proprietary data from Amazon, Code for America, Google, Microsoft, Observable, or any other organization affiliated with the authors was used in this book.

Data for each chapter can be downloaded on demand using the commands in the book. Alternatively, to work offline, you can download all of the files at once as a single ZIP file (here: immediately download ZIP file of all code and data).

Data Access Methods

  • Direct downloads (short URL). These are shown in the book and use short URLs for convenience. For example: read.csv("https://bit.ly/3WgWouJ").

    The short URLs for the data files are:
    
    https://bit.ly/3Us5jrR  (Chapter 5)
    https://bit.ly/3TIVTrP  (Chapter 5)
    https://bit.ly/3WgWouJ  (Chapter 8)
    https://bit.ly/3DnYEaV  (Chapter 9; RDS file)
    https://bit.ly/3FvVoNE  (Chapter 10)
    https://bit.ly/3SRnq9l  (Chapter 10)
                            
    Commands to read those in R (without assigning to an object) are:
    
    read.csv("https://bit.ly/3Us5jrR")             # Chapter 5
    read.csv("https://bit.ly/3TIVTrP")             # Chapter 5
    read.csv("https://bit.ly/3WgWouJ")             # Chapter 8
    readRDS(gzcon(url("https://bit.ly/3DnYEaV")))  # Chapter 9
    read.csv("https://bit.ly/3FvVoNE")             # Chapter 10
    read.csv("https://bit.ly/3SRnq9l")             # Chapter 10
                            
  • Direct downloads (long URL). This works with full URLs hosted on this site. These have the format https://quantuxbook.com/data/FILENAME. An example is shown further down on this page.

    The long URLs for the data files are:
    
    https://quantuxbook.com/data/statistics-dat1.csv            (Chapter 5)
    https://quantuxbook.com/data/statistics-dat2.csv            (Chapter 5)
    https://quantuxbook.com/data/csat-data.csv                  (Chapter 8)
    https://quantuxbook.com/data/epa-server.rds                 (Chapter 9; RDS file)
    https://quantuxbook.com/data/qualtrics-pizza-maxdiff.csv    (Chapter 10)
    https://quantuxbook.com/data/qualtrics-maxdiff-usecases.csv (Chapter 10)
                            
    Commands to read those in R (without assigning to an object) are:
    
    read.csv("https://quantuxbook.com/data/statistics-dat1.csv")
    read.csv("https://quantuxbook.com/data/statistics-dat2.csv")
    read.csv("https://quantuxbook.com/data/csat-data.csv")
    readRDS(gzcon(url("https://quantuxbook.com/data/epa-server.rds")))
    read.csv("https://quantuxbook.com/data/qualtrics-pizza-maxdiff.csv")
    read.csv("https://quantuxbook.com/data/qualtrics-maxdiff-usecases.csv")
                            
  • Alternative for offline access, all chapters: get the single ZIP file of all code and data (warning: downloads immediately). Unzip it, then load individual files from the local folder, as shown in the last example on this page.
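
If you only need a few files offline, one option is to save a local copy of each file once and read from that copy afterwards. The following is a minimal sketch, not a method from the book: the helper name local_copy and the default "data" folder are hypothetical, while the URLs are the long URLs listed above. The mode = "wb" argument writes the bytes unchanged, which matters for binary files such as the RDS file on Windows.

```r
# Hypothetical helper: save a file from a URL into a local folder once,
# and return the local path. The download is skipped if the file exists.
local_copy <- function(file_url, dir = "data") {
  if (!dir.exists(dir)) dir.create(dir, recursive = TRUE)
  dest <- file.path(dir, basename(file_url))
  if (!file.exists(dest)) {
    # mode = "wb" writes bytes unchanged (needed for .rds files on Windows)
    download.file(file_url, destfile = dest, mode = "wb")
  }
  dest
}

# Example usage (requires Internet access on the first call only):
# csat.path <- local_copy("https://quantuxbook.com/data/csat-data.csv")
# csat.data <- read.csv(csat.path)
# epa.path  <- local_copy("https://quantuxbook.com/data/epa-server.rds")
# epa.data  <- readRDS(epa.path)   # a local RDS file needs no gzcon(url())
```

Later calls with the same URL and folder return the saved path without downloading again.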

Example Code Using Direct URLs in R

Assuming you have Internet access, you can download data directly into R, as shown in the book. The following is an example:

> csat.data <- read.csv("https://quantuxbook.com/data/csat-data.csv")  # direct URL download
> summary(csat.data)
     Date               Rating        Country         
 Length:36048       Min.   :1.000   Length:36048      
 Class :character   1st Qu.:4.000   Class :character  
 Mode  :character   Median :4.000   Mode  :character  
                    Mean   :4.051                     
                    3rd Qu.:5.000                     
                    Max.   :5.000                     
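
After loading, it can help to confirm that the data frame has the columns you expect before continuing. The following is a minimal sketch under that assumption. The helper name check_columns is hypothetical, and the column names in the usage example are taken from the summary() output above.

```r
# Hypothetical helper: stop with a clear message if expected columns are missing.
check_columns <- function(df, expected) {
  missing.cols <- setdiff(expected, names(df))
  if (length(missing.cols) > 0) {
    stop("Missing columns: ", paste(missing.cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Example usage, with column names from the summary() output above:
# check_columns(csat.data, c("Date", "Rating", "Country"))
```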
                

Example Code Using a Downloaded CSV File

Alternatively, if you are offline or have difficulty with direct URL downloads, you can load files saved to a local folder. First, download the ZIP file above and unzip it. Then, assuming csat-data.csv is now in the /users/chris/Downloads folder, load it in R as follows:

> csat.data <- read.csv("/users/chris/Downloads/csat-data.csv")  # load from local folder
> summary(csat.data)
     Date               Rating        Country         
 Length:36048       Min.   :1.000   Length:36048      
 Class :character   1st Qu.:4.000   Class :character  
 Mode  :character   Median :4.000   Mode  :character  
                    Mean   :4.051                     
                    3rd Qu.:5.000                     
                    Max.   :5.000
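
The two approaches above can also be combined: try the direct URL first and, if that fails, read the local copy instead. This is only a sketch of one possible way to do it. The helper name read_csv_with_fallback is hypothetical, and it assumes the local file has the same name as the file in the URL.

```r
# Hypothetical helper: try the URL first; if that fails, read a local copy
# with the same file name from local_dir.
read_csv_with_fallback <- function(file_url, local_dir) {
  tryCatch(
    suppressWarnings(read.csv(file_url)),
    error = function(e) {
      local_path <- file.path(local_dir, basename(file_url))
      message("Download failed; reading ", local_path)
      read.csv(local_path)
    }
  )
}

# Example usage, with the URL and folder from the examples above:
# csat.data <- read_csv_with_fallback(
#   "https://quantuxbook.com/data/csat-data.csv", "/users/chris/Downloads")
```

When you are offline, the first attempt may take some time to fail before the local file is read.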