Description
Introduction
Are all datasets created equal? Data science students with no prior exposure to manipulating data in Excel or a similar spreadsheet application need to become familiar and comfortable with applying formulas and other operations to understand and “massage” data before performing analysis. Even those who have used formulas and performed calculations in Excel may not have done so with healthcare datasets. In this unit, we learn what kinds of data can come out of a dataset extracted from a healthcare enterprise data warehouse, get comfortable with summation of data, identify and understand the core terminology of healthcare data, look for missing data and understand its implications for the analysis of outcomes, and prepare a dataset for its intended purpose.
There is typically more than one way to approach a task in Excel or a similar program, especially when it comes to calculations and descriptive statistics. You will notice a few hints throughout the assignment that guide you through some of the more comprehensive dataset analysis and preparation tasks, but for the most part you will be on your own to explore the dataset and make key decisions about the techniques used to solve the simple and complex problems within it. Many guides for solving Excel problems are freely available online, but those who have had little exposure to Excel as a statistical and data conversion tool may wish to obtain a book from the many choices typically available at a local bookstore or library. The latest edition of a book or the latest version of the software is typically not necessary; any relatively recent version will do, and the same applies to the PC operating system.