Description
Instructions for Tasks
The full details of the tasks are in the attached Word document. There are three tasks in total, and you are expected to answer all the questions in each. Task 1 is an application of descriptive data. Task 2 is a practical application of regression and classification techniques that demonstrates your understanding of them. Task 3 is a data mining report. The maximum permitted word count for all tasks combined is 2000 words. A guide word count is provided for each task.
General Instructions:
- Word Limit: Ensure you adhere to the total word limit of 2000 words for all tasks combined. Any content beyond this limit will not be considered for grading.
- Referencing: Ensure any external sources, data, or references are appropriately cited. Plagiarism is strictly prohibited. Use of GPT is not permitted.
Task 1: Application of Descriptive Data (Approx. 100 words):
- Dataset Analysis: Begin by understanding and exploring the dataset provided. Identify key variables and their types.
- Descriptive Statistics: Using tools such as Excel, R, or Python, calculate the basic descriptive statistics like mean, median, mode, standard deviation, and variance.
- Visualization: Develop relevant visualizations like bar charts, histograms, or scatter plots to represent the data distribution and patterns.
- Interpretation: Provide insights on the data distribution, trends, and any patterns you observe.
- Conclusion: Summarize the findings from your descriptive analysis.
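As an illustration of the descriptive statistics step above, here is a minimal Python sketch that uses only the standard library. The numbers in `values` are hypothetical placeholders, not taken from the assignment dataset; substitute a numeric column from the data provided.

```python
import statistics

# Hypothetical sample values (an assumption for illustration only);
# replace with a numeric column from the provided dataset.
values = [12, 15, 15, 18, 20, 22, 15, 30]

summary = {
    "mean": statistics.mean(values),
    "median": statistics.median(values),
    "mode": statistics.mode(values),
    "variance": statistics.variance(values),  # sample variance (divides by n - 1)
    "std_dev": statistics.stdev(values),      # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

Note that `statistics.variance` and `statistics.stdev` compute the sample versions; `statistics.pvariance` and `statistics.pstdev` give the population versions if the data covers the whole population.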
Task 2: Regression and Classification Technique (Approx. 300 words):
- Selection of Technique: Decide whether regression or classification is more appropriate based on the nature of your dependent variable.
- Data Preparation: Ensure the data is cleaned, missing values are handled, and the data is split into training and testing sets.
- Model Development:
  - For Regression: Determine the dependent and independent variables. Use a relevant regression technique (linear, logistic, etc.) to develop your model.
  - For Classification: Choose an appropriate algorithm (e.g., Decision Trees, k-NN, SVM) based on the nature of your data.
- Model Evaluation: Use appropriate metrics (like R-squared for regression, accuracy/precision/recall for classification) to evaluate your model’s performance on the test data.
- Interpretation: Discuss the model’s results, the significance of variables, and any patterns observed.
- Conclusion: Sum up the findings and potential implications of your model.
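The Task 2 workflow (split, fit, evaluate on the test set) can be sketched in plain Python as follows. This is one possible outline, not a required method: the synthetic `data`, the helper names, and the 25% test fraction are all assumptions made for illustration, and in practice a library such as scikit-learn would normally be used instead.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split into (train, test)."""
    rows = rows[:]
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]

def fit_simple_linear(train):
    """Ordinary least squares for one predictor: y = slope * x + intercept."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(rows, slope, intercept):
    """Coefficient of determination of the fitted line on the given rows."""
    ys = [y for _, y in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def classification_metrics(actual, predicted, positive=1):
    """Accuracy, precision and recall for a binary classifier's predictions."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Synthetic data (assumption): y is roughly 2x + 1 with small alternating noise.
data = [(x, 2 * x + 1 + (0.1 if x % 2 == 0 else -0.1)) for x in range(20)]

train, test = train_test_split(data)
slope, intercept = fit_simple_linear(train)
r2 = r_squared(test, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} test R^2={r2:.3f}")
```

The model is fitted on the training rows only and scored on the held-out test rows, which is what keeps the reported metric an honest estimate. For a categorical dependent variable, `classification_metrics` would be applied to the test labels and the classifier's predictions in place of `r_squared`.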
Task 3: Data Mining Report (Approx. 1600 words):
- Introduction: Provide a brief overview of the context and objectives of your data mining task.
- Data Understanding: Describe the dataset, including the source, variables, and any initial observations.
- Data Preparation: Document any preprocessing steps you undertook, like normalization, handling missing values, or feature engineering.
- Data Mining Techniques: Discuss the techniques you employed (e.g., clustering, association rule mining, neural networks) and the rationale behind choosing them.
- Results: Present the findings from your data mining process. Use visualizations where necessary to showcase patterns or insights.
- Challenges: Highlight any challenges faced during the process, whether related to data quality, model performance, or interpretation.
- Conclusion and Recommendations: Conclude your report by summarizing key findings. Additionally, provide recommendations or potential applications of your insights.
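To make the data preparation and technique steps of Task 3 concrete, the sketch below applies min-max normalization and then a basic k-means clustering in plain Python. Clustering is only one of the techniques listed above, chosen here as an example; the sample `rows`, the choice of `k = 2`, and all function names are assumptions for illustration rather than part of the assignment.

```python
import random

def min_max_normalize(rows):
    """Rescale each column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=20, seed=0):
    """Basic k-means: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Hypothetical two-feature records on very different scales (assumption).
rows = [(1, 10), (2, 12), (1.5, 11.5), (9, 95), (10, 100), (9.5, 98)]

normalized = min_max_normalize(rows)
labels, centroids = k_means(normalized, k=2)
print(labels)
```

Normalizing first matters here because the second feature has a much larger range than the first and would otherwise dominate the distance calculation.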