Data Science - Statistical Modelling and Analysis
Timeline
- February 5, 2024: Experience start
- February 16, 2024: Defining the Problem and Hypothesis
- March 2, 2024: Clean the Data
- March 16, 2024: Exploratory Data Analysis
- April 12, 2024: Data Modelling and Analysis
- April 29, 2024: Final Product
- May 3, 2024: Experience end
Timeline
- February 5, 2024: Experience start
- February 16, 2024: Defining the Problem and Hypothesis
GOAL 1. Identify the institution. Look at the dataset you are given and consider which institution might be interested in the questions you could answer with it.
GOAL 2. Choose a relevant problem that the institution might be facing. The problem must be addressable using the dataset provided. Since this is an introductory class, focus on problems that are not too complex and can be answered with limited data and variables. For example, you may want to see whether 3 variables impact another variable and then describe those impacts.
GOAL 3. You will come up with 3 hypotheses.
a. The final versions will require knowledge of topics covered later in the course, so that the questions are reasonable to answer. At the initial stage, however, ask as many questions as possible. By the end of the course, you can narrow the list down to the required 3.
b. You also need to consider the dataset. You can only answer questions and test hypotheses for which you have relevant variables in your dataset.
GOALS 1-3 complete. Each group submits a 3-minute presentation of the problem and the hypotheses. The presentation is made as if the class members are the executives of the hiring institution.
- March 2, 2024: Clean the Data
GOAL 4. Cleaning the data is an essential step in the data analysis process. Without this step, your results might be useless, or might reflect trends that are artifacts of ‘dirty’ data rather than real patterns.
The group will need to sign and submit a worksheet confirming that the data has been cleaned.
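As a rough illustration of what cleaning can involve, the sketch below removes duplicate records and rows with missing or non-numeric values. It is written in plain Python for portability, although the experience itself works in R; the column names, the records, and the helper functions `to_number` and `clean` are all hypothetical and not part of the course material.

```python
# Hypothetical raw records; column names and values are invented for illustration.
raw_rows = [
    {"id": "1", "age": "34", "income": "52000"},
    {"id": "2", "age": "", "income": "48000"},        # missing age
    {"id": "3", "age": "29", "income": " 61000 "},    # stray whitespace
    {"id": "3", "age": "29", "income": " 61000 "},    # exact duplicate
    {"id": "4", "age": "forty", "income": "39000"},   # non-numeric age
]


def to_number(text):
    """Return the field as a float, or None when it is blank or not numeric."""
    try:
        return float(text.strip())
    except ValueError:
        return None


def clean(rows):
    """Keep the first usable row per id; record why every other row was dropped."""
    seen_ids = set()
    cleaned, dropped = [], []
    for row in rows:
        age = to_number(row["age"])
        income = to_number(row["income"])
        if row["id"] in seen_ids:
            dropped.append((row["id"], "duplicate"))
        elif age is None or income is None:
            dropped.append((row["id"], "missing or non-numeric value"))
        else:
            seen_ids.add(row["id"])
            cleaned.append({"id": row["id"], "age": age, "income": income})
    return cleaned, dropped


cleaned_rows, dropped_rows = clean(raw_rows)
print("kept:", [row["id"] for row in cleaned_rows])
print("dropped:", dropped_rows)
```

Keeping a list of dropped rows with reasons is one possible way to document the cleaning decisions that the signed worksheet attests to. Whether to drop or impute missing values is a judgment call that depends on the dataset.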
- March 16, 2024: Exploratory Data Analysis
GOAL 5. You must perform exploratory data analysis. At this stage, look at all the relevant variables in the dataset (for example individually, in pairs, or in threes, depending on the questions you are asking). After this step, you should be able to provide a qualitative and quantitative description of the dataset. This should give you insight into possible answers to your questions and will help you check whether the assumptions of your intended methods are met.
The group will submit a 2-page report of exploratory data analysis results. The report should contain results for all variables that are relevant to the problem.
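A minimal sketch of looking at variables individually and in pairs is shown here, assuming a small invented dataset of study hours and exam scores. It is written in plain Python rather than R, and the data and the helper names `describe` and `pearson_r` are hypothetical.

```python
import statistics

# Hypothetical cleaned dataset: hours studied and exam score for six students.
hours = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
score = [55.0, 61.0, 70.0, 74.0, 85.0, 90.0]


def describe(values):
    """Summary statistics for a single numeric variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length variables."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / (ss_x * ss_y) ** 0.5


hours_summary = describe(hours)
score_summary = describe(score)
r_hours_score = pearson_r(hours, score)

print("hours:", hours_summary)
print("score:", score_summary)
print("correlation:", round(r_hours_score, 3))
```

Numbers like these would normally sit alongside plots (histograms, scatter plots) in the report, since a single correlation coefficient can hide non-linear patterns and outliers.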
- April 12, 2024: Data Modelling and Analysis
GOAL 6. Fit models. You can use any model or method learned in the course to answer your questions. You must use at least 2 different models from the course.
GOAL 6 complete. The group will submit a one-page report of the results.
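To make the idea of fitting and comparing two models concrete, the sketch below fits an intercept-only (mean) model and a simple least-squares line to the same invented data, then compares them by their sum of squared errors. These two models are chosen only for illustration and may not be the ones covered in the course; the code is plain Python rather than R, and the data and the function name `fit_line` are hypothetical.

```python
import statistics

# Hypothetical dataset: hours studied (predictor) and exam score (response).
hours = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
score = [55.0, 61.0, 70.0, 74.0, 85.0, 90.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    slope = cross / ss_x
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Model 1: intercept-only baseline, predicts the mean score for everyone.
mean_score = statistics.mean(score)
sse_mean = sum((y - mean_score) ** 2 for y in score)

# Model 2: simple linear regression of score on hours.
intercept, slope = fit_line(hours, score)
sse_linear = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(hours, score))

# Share of the baseline's squared error that the line explains.
r_squared = 1 - sse_linear / sse_mean

print("baseline SSE:", round(sse_mean, 2))
print("linear SSE:", round(sse_linear, 2))
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("R squared:", round(r_squared, 3))
```

Comparing against a baseline gives the one-page report a simple story: how much better the chosen model does than predicting the average for everyone.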
- April 29, 2024: Final Product
GOAL 7. Finally, communicate your results. This is one of the most important skills to master. You must communicate your results in a non-technical way, so that (a) everyone in the audience understands the story (it is comprehensible and compelling), and (b) you keep the attention of the audience.
Groups submit the completed deliverable and present their results.
- May 3, 2024: Experience end
Experience scope
Categories
Data analysis, Data modelling, Data science
Skills
Statistical software (R)
Looking to elevate your organization and bring it to the next level? Bring on learners from the College of Wooster to be your learner-consultants in a project-based experience. Learners will work on one main project over the course of the semester, connecting with you as needed with virtual communication tools.
Learners in this program/experience will learn concepts, techniques and tools they need to deal with various facets of data science practice, including data collection and integration, exploratory data analysis, predictive modeling, descriptive modeling, data product creation, evaluation, and effective communication.
Learners
Deliverables are negotiable, and will seek to align the needs of the learners and the organization.
Some final project deliverables might include:
- A detailed report including their research, analysis, insights and recommendations
Project Examples
Requirements
Learners in groups of 2-4 will work with your company to identify your needs and provide actionable recommendations, based on their in-depth research and analysis.
Project activities that learners can complete may include, but are not limited to:
- Identify problems based on your company's case study that are solvable with data science.
- Use R to carry out basic statistical modeling and analysis.
- Apply basic tools (plots, graphs, summary statistics) to carry out EDA.
- Create effective visualization of given data (to communicate or persuade).
- Conduct and apply ethical practices based on your company's case study.
Additional company criteria
Companies must answer the following questions to submit a match request to this experience:
Be available for a quick phone/virtual call with the instructor to initiate your relationship and confirm your scope is an appropriate fit for the experience.
Provide a dedicated contact person who is available for weekly or bi-weekly drop-ins to address learners’ questions, as well as periodic messages over the duration of the project.
Provide an opportunity for learners to present their work and receive feedback.
Provide relevant information and/or data as needed for the project.
How is your project relevant to the experience?