Ch0 Welcome

1 About the Notes

These are the lecture notes on descriptive analytics and data visualization. They cover a wide range of topics: descriptive statistics, exploratory data analysis, data visualization principles, basic visualization for continuous and categorical data, and advanced visualization for time series, spatial, network, and text data. The notes use R throughout and include R code and R output. The data sets used in the notes are available for download at https://github.com/yichenqin/dataviz

BANA4137, Descriptive Analytics and Data Visualization, is an upper-level undergraduate course offered at the University of Cincinnati. The course objectives are listed below.

  • Be able to define “descriptive analytics” and explain how it can be used to help make better decisions
  • Use descriptive analytics tools such as data exploration, data filtering and summary statistics to check data quality and diagnose data errors
  • Use descriptive analytics tools to interpret data and produce insights
  • Use problem-framing techniques such as influence diagrams to help define problems and provide structure
  • Explain the key design principles and techniques for visualizing data
  • Create data visualizations (charts and tables) using widely available software tools
  • Use data storytelling techniques to create meaningful presentations that convey insights from analytics
  • Perform your data analysis in a literate programming environment
  • Import and manage structured and unstructured data
  • Manipulate, transform, and summarize your data, and join disparate data sources

2 About the Instructor

Yichen Qin is an Associate Professor of Business Analytics in the Department of Operations, Business Analytics, and Information Systems at the Carl H. Lindner College of Business, University of Cincinnati. He earned his Ph.D. in Applied Mathematics and Statistics from the Johns Hopkins University in 2013. Dr. Qin teaches Descriptive Analytics and Data Visualization, Data Analysis Methods, Forecasting and Time Series Methods, and Business Analytics. His research interests include computational statistics, mixture models, robust statistics, model selection, network analysis, data visualization, and clinical trial design. For more information, please visit https://www.yichenqin.com/ or email .

3 Acknowledgements

This is a joint project with Professor Yang Li at Renmin University of China (RUC). The notes would not have been possible without the help and contributions of my collaborators and students from Renmin University of China and the University of Cincinnati. I am grateful for the help from Yanlei Kong, Jingru Sun, Rong Li, Jiebin Li, Qian Du, Dongzuo Liang, Huiyun Tang, Mingyue Pan, Zhao Xiong, Jiaxin Xie, Xirui Zhao, Fanglu Chen, Heming Deng, Xiaolin Xu, Mingyue Zhang, Mingcong Wu, Zewei Lin, Jiawei Huang, and Tianhai Zu.