Commit fb035e87 authored by buckl113, committed by Colbry, Dirk

Added compatibility with Google Colab

parent fcdb57bc
%% Cell type:markdown id: tags:
# Classification Using Auto-SKLearn
%% Cell type:markdown id: tags:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github.com/mcint170/DataTools_Tutorial_Demo/blob/main/Auto-SKLearn_AutoML/Classification.ipynb)
%% Cell type:code id: tags:
```
!pip install auto-sklearn
```
%% Cell type:markdown id: tags:
After this cell finishes, click Runtime -> Restart runtime, then run the following cells.
%% Cell type:code id: tags:
```
# Imports
from pprint import pprint
import pickle

import sklearn.datasets
import sklearn.metrics
import sklearn.model_selection  # used by train_test_split in the next cell
import autosklearn.classification
```
%% Cell type:code id: tags:
```
# Load the breast cancer dataset and split it into train and test sets
X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)
```
%% Cell type:code id: tags:
```
# Fit the classifier
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    per_run_time_limit=30,
    tmp_folder='/tmp/autosklearn_classification_example_tmp',
)
automl.fit(X_train, y_train, dataset_name='breast_cancer')
```
%% Output
AutoSklearnClassifier(per_run_time_limit=30, time_left_for_this_task=120,
                      tmp_folder='/tmp/autosklearn_classification_example_tmp')
%% Cell type:code id: tags:
```
# Leaderboard of the models evaluated by auto-sklearn
print(automl.leaderboard())
```
%% Output
          rank  ensemble_weight                 type      cost  duration
model_id
7            1             0.10          extra_trees  0.014184  1.502508
2            2             0.02        random_forest  0.028369  2.024807
36           3             0.06  k_nearest_neighbors  0.028369  0.853534
26           4             0.04          extra_trees  0.028369  2.240347
19           5             0.02          extra_trees  0.028369  2.791073
22           6             0.02    gradient_boosting  0.028369  1.149980
3            7             0.14                  mlp  0.028369  1.667622
12           8             0.04    gradient_boosting  0.035461  1.240657
17           9             0.02    gradient_boosting  0.035461  1.510491
8           10             0.02        random_forest  0.035461  1.958862
37          11             0.06    gradient_boosting  0.035461  1.585859
5           12             0.04        random_forest  0.035461  2.075770
27          13             0.10          extra_trees  0.042553  1.910083
34          14             0.08        random_forest  0.042553  1.884860
9           15             0.04          extra_trees  0.042553  1.799630
23          16             0.02                  mlp  0.049645  2.405247
35          17             0.06          extra_trees  0.056738  1.586217
32          18             0.02          extra_trees  0.063830  1.650489
38          19             0.02          extra_trees  0.063830  2.128083
20          20             0.02   passive_aggressive  0.078014  0.774718
30          21             0.04             adaboost  0.078014  3.121010
29          22             0.02          gaussian_nb  0.141844  1.951357
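%% Cell type:markdown id: tags:
The leaderboard can be sanity-checked with plain Python. The sketch below copies the `ensemble_weight` and `cost` columns from the output above and checks that the weights sum to 1, then finds the lowest-cost and the heaviest-weighted models. It assumes that `cost` is 1 minus the validation accuracy, which is our reading of how auto-sklearn reports the default accuracy metric; the variable names are illustrative and not part of the notebook.
%% Cell type:code id: tags:
```python
import math

# (model_id, ensemble_weight, cost) copied from the leaderboard output above.
leaderboard = [
    (7, 0.10, 0.014184), (2, 0.02, 0.028369), (36, 0.06, 0.028369),
    (26, 0.04, 0.028369), (19, 0.02, 0.028369), (22, 0.02, 0.028369),
    (3, 0.14, 0.028369), (12, 0.04, 0.035461), (17, 0.02, 0.035461),
    (8, 0.02, 0.035461), (37, 0.06, 0.035461), (5, 0.04, 0.035461),
    (27, 0.10, 0.042553), (34, 0.08, 0.042553), (9, 0.04, 0.042553),
    (23, 0.02, 0.049645), (35, 0.06, 0.056738), (32, 0.02, 0.063830),
    (38, 0.02, 0.063830), (20, 0.02, 0.078014), (30, 0.04, 0.078014),
    (29, 0.02, 0.141844),
]

# The ensemble weights should add up to 1 (within floating-point tolerance).
total_weight = sum(weight for _, weight, _ in leaderboard)
weights_sum_to_one = math.isclose(total_weight, 1.0, abs_tol=1e-9)

# Lowest-cost model, i.e. rank 1 in the leaderboard.
best_id, _, best_cost = min(leaderboard, key=lambda row: row[2])

# ASSUMPTION: cost = 1 - validation accuracy for the default accuracy metric.
best_val_accuracy = 1.0 - best_cost

# The model with the largest ensemble weight need not be the rank-1 model.
heaviest_id = max(leaderboard, key=lambda row: row[1])[0]

print("Weights sum to 1:", weights_sum_to_one)
print("Lowest-cost model_id:", best_id)
print("Implied validation accuracy (assumed):", round(best_val_accuracy, 6))
print("Largest-weight model_id:", heaviest_id)
```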
%% Cell type:code id: tags:
```
# Show the different models
pprint(automl.show_models(), indent=4)
```
%% Cell type:code id: tags:
```
# Predict the test labels
predictions = automl.predict(X_test)
print("Accuracy score:", sklearn.metrics.accuracy_score(y_test, predictions))
```
%% Output
Accuracy score: 0.9440559440559441
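%% Cell type:markdown id: tags:
The accuracy score is the fraction of test samples whose predicted label matches the true label. The sketch below reimplements that definition in plain Python with a hypothetical `accuracy` helper and made-up toy labels; it is not the scikit-learn implementation. The toy case uses 135 correct predictions out of 143, which is consistent with the score reported above if the default 25% split of the 569-sample dataset yields 143 test samples (an inference, not something the notebook prints).
%% Cell type:code id: tags:
```python
def accuracy(y_true, y_pred):
    """Return the fraction of positions where y_pred equals y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy labels, made up for illustration: 143 samples, 8 of them misclassified.
y_true_toy = [1] * 143
y_pred_toy = [1] * 135 + [0] * 8

toy_score = accuracy(y_true_toy, y_pred_toy)
print("Toy accuracy score:", toy_score)
```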
%% Cell type:code id: tags:
```
# Export the top-ranked model (model_id 7 in the leaderboard above)
clf = automl.show_models()[7]['sklearn_classifier']
with open('model.pickle', 'wb') as f:
    pickle.dump(clf, f)
```
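%% Cell type:markdown id: tags:
To reuse the exported model later, the pickle file can be read back. This is a minimal sketch with a hypothetical `load_model` helper that is not part of the notebook. It assumes `model.pickle` was written by the export cell and that the loading environment has the same scikit-learn and auto-sklearn versions installed, since unpickling needs the original classes. Only unpickle files you trust, because `pickle.load` can execute arbitrary code.
%% Cell type:code id: tags:
```python
import os
import pickle


def load_model(path):
    """Load a pickled object from `path` (hypothetical helper for illustration)."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Reload only if the export cell has already produced the file.
if os.path.exists("model.pickle"):
    restored_clf = load_model("model.pickle")
    print("Restored object of type:", type(restored_clf).__name__)
```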