4th December, 2023

Applied a neural network classifier using scikit-learn’s MLPClassifier and assessed its performance through cross-validation. First, extracted the features from the DataFrame by selecting the ‘inquiry_type’, ‘how_contacted’, and ‘research_type’ columns. Then one-hot encoded these categorical variables to convert them into a numeric format suitable for the neural network.
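The workflow can be sketched roughly as follows. This is a minimal illustration, not the original script: the data is synthetic, the category values are made up, and the binary ‘target’ column is hypothetical because the entry does not name the variable being predicted. The hidden layer size and the 5 folds are also assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in for the real dataset (category values are made up).
df = pd.DataFrame({
    "inquiry_type": rng.choice(["reference", "directional", "technical"], size=n),
    "how_contacted": rng.choice(["in_person", "email", "phone"], size=n),
    "research_type": rng.choice(["course", "thesis", "personal"], size=n),
})
# Hypothetical binary target; the log entry does not say what was predicted.
df["target"] = (df["inquiry_type"] == "technical").astype(int)

# One-hot encode the three categorical feature columns.
feature_cols = ["inquiry_type", "how_contacted", "research_type"]
X = pd.get_dummies(df[feature_cols]).astype(float)
y = df["target"]

# Assumed settings: one hidden layer of 16 units, 5-fold cross-validation.
clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)

print("Accuracy per fold:", scores)
print("Mean accuracy:", scores.mean())
```

Reporting the per-fold scores alongside the mean makes it easier to see how much the accuracy varies between folds.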

1st December, 2023

Used bootstrapping, which involves repeatedly resampling with replacement from the original data to create multiple bootstrap samples. In this case, for each of the 1000 iterations, a bootstrap sample was drawn from the mean service times for each ‘inquiry_type’. The mean of each bootstrap sample was then computed and stored in the list ‘bootstrap_means’.
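A minimal sketch of this procedure, under stated assumptions, is shown below. The data is synthetic, the column name ‘service_time’ and the category values are assumed, and the percentile interval at the end is an optional extra that the entry does not mention.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic stand-in data; 'service_time' is an assumed column name.
df = pd.DataFrame({
    "inquiry_type": rng.choice(
        ["reference", "directional", "technical", "other"], size=300
    ),
    "service_time": rng.exponential(scale=15.0, size=300),
})

# Mean service time for each inquiry type.
mean_by_type = df.groupby("inquiry_type")["service_time"].mean().to_numpy()

n_iterations = 1000
bootstrap_means = []
for _ in range(n_iterations):
    # Resample the per-type means with replacement, keeping the same size.
    sample = rng.choice(mean_by_type, size=len(mean_by_type), replace=True)
    bootstrap_means.append(sample.mean())

# Optional extra (not in the original entry): a 95% percentile interval.
lower, upper = np.percentile(bootstrap_means, [2.5, 97.5])
print(f"Bootstrap mean of means: {np.mean(bootstrap_means):.2f}")
print(f"95% percentile interval: [{lower:.2f}, {upper:.2f}]")
```

Note that resampling the handful of per-type means gives a coarse distribution; resampling the individual service times would be the alternative if the raw observations are of interest.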

29th November, 2023

Standardized the features using z-score scaling, applied PCA to derive the principal components, and printed the explained variance ratio for each component. Additionally, plotted the cumulative explained variance against the number of principal components, which helps identify how many components to keep for dimensionality reduction.
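The steps above might look like the following sketch. The feature matrix here is random synthetic data with five columns, since the entry does not list which columns were used; only the scaling, PCA, and plotting steps mirror the description.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic numeric features (assumed shape: 150 rows, 5 columns).
X = rng.normal(size=(150, 5))
X[:, 1] += 0.8 * X[:, 0]  # make two columns correlated

# z-score scaling: each column gets mean 0 and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA keeping all components so the full variance profile is visible.
pca = PCA()
pca.fit(X_scaled)

explained = pca.explained_variance_ratio_
for i, ratio in enumerate(explained, start=1):
    print(f"PC{i}: {ratio:.3f}")

# Cumulative explained variance against the number of components.
cumulative = np.cumsum(explained)
fig, ax = plt.subplots()
ax.plot(range(1, len(cumulative) + 1), cumulative, marker="o")
ax.set_xlabel("Number of principal components")
ax.set_ylabel("Cumulative explained variance")
ax.set_title("PCA cumulative explained variance")
# Call plt.show() in an interactive session to display the figure.
```

A common way to read the plot is to pick the smallest number of components whose cumulative ratio passes a chosen threshold, for example 0.9.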

27th November, 2023

Created a binary variable indicating whether service time is longer than 30 minutes. Used the model to predict this variable on the test set, then evaluated its accuracy and printed the result along with a confusion matrix, which shows how the correct and incorrect predictions are distributed across the two classes.
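As a rough illustration of this step, the sketch below builds the binary variable and evaluates a classifier on a held-out test set. The entry does not say which model was used, so logistic regression is swapped in here as an assumption. The data is synthetic, and the names ‘service_time’ and ‘long_service’ are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n = 400

# Synthetic data: service time loosely depends on the inquiry type.
inquiry = rng.choice(["reference", "directional", "technical"], size=n)
scale = pd.Series(inquiry).map(
    {"reference": 20.0, "directional": 10.0, "technical": 40.0}
)
df = pd.DataFrame({
    "inquiry_type": inquiry,
    "service_time": rng.exponential(scale=scale.to_numpy()),
})

# Binary variable: 1 if service time is longer than 30 minutes, else 0.
df["long_service"] = (df["service_time"] > 30).astype(int)

X = pd.get_dummies(df[["inquiry_type"]]).astype(float)
y = df["long_service"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Assumed model: logistic regression (the original model is not stated).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(f"Accuracy: {accuracy:.3f}")
print("Confusion matrix (rows: actual, columns: predicted):")
print(cm)
```

When the two classes are imbalanced, the confusion matrix is more informative than accuracy alone, because a model that always predicts the majority class can still score a high accuracy.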

24th November, 2023

Worked on the Research dataset, which was divided into training and testing sets. Computed the average service time for each type of study and prepared the features and target variable for a linear regression model. Then forecasted service times on the test set. The script uses matplotlib to show the regression line and Root Mean Squared Error (RMSE) to evaluate the model. The final result is a figure showing how the linear regression model predicts the average service time depending on the type of study.
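One possible reading of this workflow is sketched below. It is not the original script: the data is synthetic, the column names ‘research_type’ and ‘service_time’ are assumed, and the type of study is encoded as an integer code so that a single regression line can be drawn. That encoding imposes an arbitrary order on the categories, and the entry does not confirm it was the approach used.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 300

# Synthetic stand-in for the Research dataset (values are made up).
df = pd.DataFrame({
    "research_type": rng.choice(["course", "personal", "thesis"], size=n)
})
base = df["research_type"].map({"course": 10.0, "personal": 15.0, "thesis": 25.0})
df["service_time"] = base + rng.normal(scale=3.0, size=n)

# Average service time for each type of study.
avg_by_type = df.groupby("research_type")["service_time"].mean()
print(avg_by_type)

# Assumed encoding: alphabetical integer codes (course=0, personal=1, thesis=2).
df["type_code"] = df["research_type"].astype("category").cat.codes
X = df[["type_code"]]
y = df["service_time"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(f"RMSE: {rmse:.2f}")

# Plot the test observations and the fitted regression line.
codes = np.arange(len(avg_by_type))
line_x = pd.DataFrame({"type_code": codes})
fig, ax = plt.subplots()
ax.scatter(X_test["type_code"], y_test, alpha=0.5, label="Test observations")
ax.plot(codes, model.predict(line_x), color="red", label="Regression line")
ax.set_xticks(codes)
ax.set_xticklabels(avg_by_type.index)
ax.set_xlabel("Type of study")
ax.set_ylabel("Service time (minutes)")
ax.legend()
# Call plt.show() in an interactive session to display the figure.
```

If the study types have no natural order, one-hot encoding them is the usual alternative; the model then predicts one average per type instead of a single line.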