DATA SOURCE: Lock5 Datasets

The textbook Statistics: Unlocking the Power of Data (2nd edition, 2017) by Lock, Lock, Lock, Lock, and Lock comes with publicly available datasets and interactive applets.

A list of the more than 100 datasets in the collection is available at this site. There is also an R package, Lock5Data, that contains documentation for the data.
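
The dataset list can also be pulled from the installed package itself. The following is a minimal sketch, assuming Lock5Data is installed from CRAN; the names lock5_index and dataset_names are illustrative, not part of the package.

```r
# install.packages("Lock5Data")  # one-time install from CRAN, if needed
library(Lock5Data)

# data(package = ...) returns an object whose $results matrix
# has one row per dataset; the "Item" column holds the dataset name.
lock5_index <- data(package = "Lock5Data")$results
dataset_names <- lock5_index[, "Item"]

length(dataset_names)  # number of datasets in the installed version
head(dataset_names)    # first few dataset names
```

The count depends on the installed version of the package, so it may not match the web list exactly.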

Example: Restaurant tips

The package documentation describes what the data frame RestaurantTips contains, so let's look at it directly.

library(Lock5Data)
help(RestaurantTips)
## Restaurant Tips
## 
## Description:
## 
##      Tip data from the First Crush Bistro
## 
## Format:
## 
##      A dataset with 157 observations on the following 7 variables.
## 
##          'Bill'  Size of the bill (in dollars)                                                         
##           'Tip'  Size of the tip (in dollars)                                                          
##        'Credit'  Paid with a credit card?  'n' or 'y'                                                  
##        'Guests'  Number of people in the group                                                         
##           'Day'  Day of the week: 'm'=Monday, 't'=Tuesday, 'w'=Wednesday, 'th'=Thursday, or 'f'=Friday 
##        'Server'  Code for specfic waiter/waitress: 'A',  'B', or 'C'                                   
##        'PctTip'  Tip as a percentage of the bill                                                       
##       
## Details:
## 
##      The owner of a bistro called First Crush in Potsdam, NY was
##      interested in studying the tipping patterns of his customers.  He
##      collected restaurant bills over a two week period that he believes
##      provide a good sample of his customers. The data recorded from 157
##      bills include the amount of the bill, size of the tip, percentage
##      tip, number of customers in the group, whether or not a credit
##      card was used, day of the week, and a coded identity of the
##      server.
## 
## Source:
## 
##      Thanks to Tom DeRosa at First Crush for providing the tipping
##      data.
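
Before plotting, it can help to check the data frame against this documentation. This is a small sketch using base R only; it assumes the installed version of Lock5Data matches the help page shown above.

```r
library(Lock5Data)

# The help page reports 157 observations on 7 variables.
dim(RestaurantTips)

# Variable names and storage types, to compare with the Format section.
str(RestaurantTips)

# How the bills are spread over servers and payment method.
table(RestaurantTips$Server, RestaurantTips$Credit)
```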

Some simple questions to ask:

  • Does the tip depend on the size of the bill?
library(ggformula)  # provides the gf_*() plotting functions
gf_point(Tip ~ Bill, data = RestaurantTips) %>%
  gf_lm()

  • Does the tip (as a percent of the bill) depend on whether the bill was paid with a credit card?
gf_jitter(PctTip ~ Credit, data = RestaurantTips) %>%
  gf_violin(fill = "blue", alpha = 0.3, color = NA)
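
The plots give a visual answer to both questions. A numerical companion could look like the sketch below, which uses only base R. The object names tip_model and pct_by_credit are illustrative, and the choice of mean and median as summaries is an assumption, not something the source prescribes.

```r
library(Lock5Data)

# Question 1: least-squares line of Tip on Bill,
# the same kind of fit that gf_lm() overlays on the scatterplot.
tip_model <- lm(Tip ~ Bill, data = RestaurantTips)
coef(tip_model)               # intercept and slope (tip dollars per bill dollar)
summary(tip_model)$r.squared  # share of tip variation explained by bill size

# Question 2: mean and median percent tip for each payment method.
pct_by_credit <- aggregate(
  PctTip ~ Credit,
  data = RestaurantTips,
  FUN = function(x) c(mean = mean(x), median = median(x))
)
pct_by_credit
```

These summaries describe the 157 sampled bills only. Deciding whether a difference between the two payment groups is more than chance would need a formal inference step.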