datascience

Predicting Blood Donations

Project Description

Forecasting blood supply is a serious and recurrent problem for blood collection managers: in January 2019, “Nationwide, the Red Cross saw 27,000 fewer blood donations over the holidays than they see at other times of the year.” Machine learning can learn patterns in donation data and help predict future blood donations, which in turn helps save lives.

In this project, we work with data from the donor database of the Blood Transfusion Service Center in Hsin-Chu City, Taiwan. About every three months, the center sends its blood transfusion service bus to a university in Hsin-Chu City to collect donated blood. The dataset, obtained from the UCI Machine Learning Repository, consists of a random sample of 748 donors. Our task is to predict whether a blood donor will donate within a given time window. We cover the full model-building process, from inspecting the dataset to using the TPOT library to automate the machine learning pipeline.

Technologies and methods employed:
- Python
- Pandas
- Logistic regression
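
As a rough illustration of the logistic regression step, here is a minimal sketch. It does not load the real donor data: it builds a synthetic table of 748 rows, and the column names (`recency_months`, `frequency`, `months_since_first`, `donated_in_window`), the rule used to generate the labels, and the choice of AUC as the metric are all assumptions made for this example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the donor table (hypothetical column names).
rng = np.random.default_rng(42)
n_donors = 748
donors = pd.DataFrame({
    "recency_months": rng.integers(0, 40, size=n_donors),
    "frequency": rng.integers(1, 30, size=n_donors),
    "months_since_first": rng.integers(2, 100, size=n_donors),
})

# Made-up labelling rule: recent, frequent donors are more likely to return.
logit = 0.1 * donors["frequency"] - 0.1 * donors["recency_months"] - 0.5
prob = (1 / (1 + np.exp(-logit))).to_numpy()
donors["donated_in_window"] = (rng.random(n_donors) < prob).astype(int)

X = donors.drop(columns="donated_in_window")
y = donors["donated_in_window"]

# Stratify so both splits keep the same share of returning donors.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Score with the predicted probability of the positive class.
test_proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, test_proba)
print(f"Test AUC: {auc:.3f}")
```

With the real dataset, the synthetic table would be replaced by a `pd.read_csv` call on the downloaded file; the rest of the flow should stay much the same.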

[Source: DataCamp]

Predicting Credit Card Approvals

Project Description

Task: Build a machine learning model to predict if a credit card application will get approved.

Commercial banks receive many credit card applications. Many are rejected, for reasons such as high loan balances, low income levels, or too many inquiries on an individual’s credit report. Manually analyzing these applications is mundane, error-prone, and time-consuming (and time is money!). Luckily, this task can be automated with machine learning, and pretty much every commercial bank does so nowadays. In this project, we build an automatic credit card approval predictor using machine learning techniques, just as real banks do.

Technologies and methods employed:
- Supervised Learning with scikit-learn
- Data Manipulation with pandas

The dataset used in this project is the Credit Card Approval dataset from the UCI Machine Learning Repository.
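
An approval predictor of this kind usually needs missing values filled in and mixed numeric and categorical columns encoded before a model can be fitted. The sketch below shows one way to wire that up with a scikit-learn pipeline. The applications are synthetic, the column names (`income`, `debt`, `credit_inquiries`, `employed`) and the approval rule are hypothetical, and logistic regression is just one reasonable classifier choice, not necessarily the one used in the project.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic applications (hypothetical columns, not the UCI schema).
rng = np.random.default_rng(0)
n = 400
applications = pd.DataFrame({
    "income": rng.normal(50_000, 15_000, size=n),
    "debt": rng.normal(10_000, 4_000, size=n),
    "credit_inquiries": rng.integers(0, 10, size=n).astype(float),
    "employed": rng.choice(["yes", "no"], size=n),
})

# Made-up approval rule: income helps, debt and inquiries hurt.
score = (
    (applications["income"] - 2 * applications["debt"]) / 10_000
    - 0.3 * applications["credit_inquiries"]
    + 1.0 * (applications["employed"] == "yes")
)
approved = (score + rng.normal(0, 1, size=n) > score.median()).astype(int)

# Knock out a few income values to mimic missing entries.
missing_rows = rng.choice(n, size=20, replace=False)
applications.loc[missing_rows, "income"] = np.nan

numeric_cols = ["income", "debt", "credit_inquiries"]
categorical_cols = ["employed"]

# Impute and scale numeric columns; one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="mean")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

clf = Pipeline([
    ("preprocess", preprocess),
    ("model", LogisticRegression(max_iter=1000)),
])

X_train, X_test, y_train, y_test = train_test_split(
    applications, approved, test_size=0.25, stratify=approved, random_state=0
)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Keeping the preprocessing inside the pipeline means the imputer and scaler are fitted on the training split only, which avoids leaking test information into the model.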

[Source: DataCamp]

Churn Case Study

Project Description

In this project, we preprocess data for machine learning and generate predictions for a customer churn case study.

Main concepts implemented:
- The different types of machine learning and when to use them.
- How to apply data preprocessing for machine learning including feature engineering.
- How to apply supervised machine learning models to generate predictions.

The dataset used in this project is a CSV file named telco.csv, which contains data on telecom customers, including whether they churned and some of their key behaviors.

Technologies and methods employed:
- Data manipulation with pandas
- Data visualization with seaborn
- Supervised Learning with scikit-learn
- scikit-learn classifiers: Decision Tree, Random Forest, and K-Nearest Neighbors.
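
To give a sense of how the three classifiers listed above can be compared, here is a minimal sketch. It does not read telco.csv: the data comes from scikit-learn's `make_classification`, with an imbalanced target standing in for churn, and the hyperparameters and the use of hold-out accuracy are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the churn table: roughly 25% positive (churned) rows.
X, y = make_classification(
    n_samples=1000,
    n_features=10,
    n_informative=5,
    weights=[0.75, 0.25],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

models = {
    "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # K-Nearest Neighbors is distance-based, so features are scaled first.
    "k_neighbors": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
}

# Fit each model on the same split and record its hold-out accuracy.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Because churn targets tend to be imbalanced, accuracy alone can be misleading; metrics such as precision and recall from `sklearn.metrics` are worth checking alongside it.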