Description
Hi Students,
I am a medical data analyst at University of Illinois at Chicago, and I have a problem, or to be honest – a few problems – on which I need your help. I received a raw data set from our Enterprise Data Warehouse team, and while I trust that it’s all decent data, I need to sort it, clean it, prep it, so I can use it for some future operations. Please assist me!!! My medical data set represents all patients for a defined period of time who were admitted into 2 hospitals with a chief or secondary diagnosis of AMI – Acute Myocardial Infarction. All of our operations for working with this data are expected to be completed in Excel.
I have two numbered facilities in the FacilityKey column, but have no idea what medical centers those numbers represent. Let’s just say I was verbally told, but it’s not on the spreadsheet. I need to create a new column next to FacilityKey with corresponding text values.
My values are:
Facility Key
- Medical Center Name
5
Illinois Medical Care Center
8
Kentucky Best Care
There are multiple ways of doing this, so choose the one that suits you best. (2 points).
Next, I need to sort the data by (1) Facility Key and (2) Readmission code (HasReadmit column). The reason is that we must eventually identify which and how many patients were readmitted within 30 days from each of the facilities.
Hint: Be sure to sort the entire spreadsheet, and not just individual columns. Sorting incorrectly will mix up numbers in the other columns unrelated to facility key and readmissions. (2 points).
Now, please provide me with the number of readmitted patients for each facility. (2 points):
Facility
Number of Readmitted Patients
Illinois Medical Care Center
Kentucky Best Care
Hmmm…. Are the numbers reported under Question 3 correct? Explain why (yes) or why not? (1 point.)
My encounter numbers in the EncounterNumber field are fine, but I really need them to look like 9-digit codes adopted by my program and evaluation process. So, I need to convert these numbers to a 9-digit format. There are several ways of doing it. Any way you prefer works for me. (2 points).
Hint: one of these several ways is to add a new column and use the “text” formula, subsequently replacing the “old” column with a “new” column when satisfied with outcomes. Also consider “filling” numbers in once you figure out the formula part.
My data includes the dates of admission (AdmitDate) and discharge (DischargeDate), but what I really need to know is the length of stay (LOS). LOS is the number of days between these dates. LOS is one of the core measures of quality and financial analysis in healthcare. It is also one of the major variables and benchmarks measured by hospital administrators. Please assist me by creating a new column and calculating the time, in days, between admission and discharge for each patient. Label this column LOS (for the length of stay). (2 points).
Hint: one of several ways to accomplish this task is by employing the “datedif” function. You may also want to google How do I calculate
Note: for a standard definition of LOS you might want to check here – https://www.easycalculation.com/medical/learn-length-of-stay.php.
After looking at the data, I can see the dates for performing principal procedures, but I have little idea how much time elapsed between the date of admission and the date of performing the principal procedure. Please help me calculate the time, in days, by introducing a new column called TimeToProcedure. After you completed the task, review numbers in the new field. Do all look right? If not – what is wrong and what would you do about it? Take an action on the spreadsheet and explain in your Word document. (1 point).
Hmmm… Are all my TimeToProcedure values in? Explain why (yes) or why not? (1 point.)
I would like to know how many of my patients were covered by Medicare. (1 point.)
Hint: think of ways to use programmatic / automated methods to perform these calculations, unless you think you would enjoy performing manual counts
How many patients are age 80 or older? (1 point.)
Hint: think of ways to use programmatic / automated methods to perform these calculations, unless you think you would enjoy performing manual counts
How many patients in this data set have died (expired)? (1 point.)
- Hint: think of ways to use programmatic / automated methods to perform these calculations, unless you think you would enjoy performing manual counts
Charlson Comorbidity Code (CCI) is a calculated measure of complications, computed based on a formula that assigns certain values to specific pre-existing and post-treatment illnesses. The higher the CCI total, the sicker the patient. How many patients that scored CCI of 10 or greater do we have in this data set? (1 point).
Create a new tab in your spreadsheet. Create the following simple table on this new tab. Sort by CCI, highest to lowest scores. (2 points).
Facility
Encounter Number
Principal Diagnosis Description
Readmission (no = 0, yes = 1)
CCI
Can we definitively state that we calculated CCI for all patients? Explain why(yes) or why not? (2 points).
- The data set includes a column with descriptions of the patient’s race, but I cannot use English descriptions for any descriptive statistics. Computers work with codes. Your goal is to use the following table of supplied race codes to add a new column next to the race description column and insert these numerical codes. You do have a column with letter codes on the spreadsheet, but unfortunately it is not particularly useful. (2 points).
Race
Numerical Code
African American / Black
10
American Indian / Alaska Native
11
Asian / Filipino
12
Caucasian / White
13
Declined
14
Eastern Indian
15
Hispanic
16
Middle Eastern
17
Other
18
Pacific Islander / Hawaiian
- 19
Unknown
- 20
Hint: while there are multiple ways of accomplishing this task, you might like the ‘vlookup’ function to look up code number and insert the corresponding description.
What percentages of male and female patients do I have in my data set?
Hint: do not attempt to count manually, unless you possess that much time to do it… Think of ways to sort data to obtain numbers programmatically and perform basic mathematical calculation of the proportion versus the total number of patients. (2 points).