Finished button

Commit 8a2dfa70 authored 3 years ago by buckl113 Committed by Colbry, Dirk 2 years ago
--- a/Auto-SKLearn_AutoML/Classification.ipynb
+++ b/Auto-SKLearn_AutoML/Classification.ipynb
@@ -27,7 +27,7 @@
    {
      "cell_type": "markdown",
      "source": [
-        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Z5s0WXnjtxSi2oLxKG1ZTTcpVXqIjLyv#scrollTo=-ZrgwiL9kR_L)"
+        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mcint170/DataTools_Tutorial_Demo/blob/main/Auto-SKLearn_AutoML/Classification.ipynb)"
      ],
      "metadata": {
        "id": "-ZrgwiL9kR_L"

 %% Cell type:markdown id: tags:
 # Classification Using Auto-SKLearn
 %% Cell type:markdown id: tags:
-[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1Z5s0WXnjtxSi2oLxKG1ZTTcpVXqIjLyv#scrollTo=-ZrgwiL9kR_L)
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mcint170/DataTools_Tutorial_Demo/blob/main/Auto-SKLearn_AutoML/Classification.ipynb)
 %% Cell type:code id: tags:
 ``` 
 !pip install auto-sklearn
 ```
 %% Cell type:markdown id: tags:
 If running on Google Colab: after running this cell, click Runtime -> Restart runtime, then run the following cells.
 %% Cell type:code id: tags:
 ``` 
 # imports
 from pprint import pprint
 import sklearn.datasets
 import sklearn.metrics
 import sklearn.model_selection  # needed for train_test_split below
 import pickle
 import autosklearn.classification
 ```
 %% Cell type:code id: tags:
 ``` 
 # split the dataset
 X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
 X_train, X_test, y_train, y_test = \
    sklearn.model_selection.train_test_split(X, y, random_state=1)
 ```
 %% Cell type:code id: tags:
 ``` 
 # Fit the classifier
 automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=120,
    per_run_time_limit=30,
    tmp_folder='/tmp/autosklearn_classification_example_tmp',
 )
 automl.fit(X_train, y_train, dataset_name='breast_cancer')
 ```
 %% Output
    AutoSklearnClassifier(per_run_time_limit=30, time_left_for_this_task=120,
                          tmp_folder='/tmp/autosklearn_classification_example_tmp')
 %% Cell type:code id: tags:
 ``` 
 # Different Models run by autosklearn
 print(automl.leaderboard())
 ```
 %% Output
              rank  ensemble_weight                 type      cost  duration
    model_id
    7            1             0.10          extra_trees  0.014184  1.502508
    2            2             0.02        random_forest  0.028369  2.024807
    36           3             0.06  k_nearest_neighbors  0.028369  0.853534
    26           4             0.04          extra_trees  0.028369  2.240347
    19           5             0.02          extra_trees  0.028369  2.791073
    22           6             0.02    gradient_boosting  0.028369  1.149980
    3            7             0.14                  mlp  0.028369  1.667622
    12           8             0.04    gradient_boosting  0.035461  1.240657
    17           9             0.02    gradient_boosting  0.035461  1.510491
    8           10             0.02        random_forest  0.035461  1.958862
    37          11             0.06    gradient_boosting  0.035461  1.585859
    5           12             0.04        random_forest  0.035461  2.075770
    27          13             0.10          extra_trees  0.042553  1.910083
    34          14             0.08        random_forest  0.042553  1.884860
    9           15             0.04          extra_trees  0.042553  1.799630
    23          16             0.02                  mlp  0.049645  2.405247
    35          17             0.06          extra_trees  0.056738  1.586217
    32          18             0.02          extra_trees  0.063830  1.650489
    38          19             0.02          extra_trees  0.063830  2.128083
    20          20             0.02   passive_aggressive  0.078014  0.774718
    30          21             0.04             adaboost  0.078014  3.121010
    29          22             0.02          gaussian_nb  0.141844  1.951357
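A minimal sketch of how the leaderboard above can be read, using only the numbers printed in the output. It assumes that `cost` is 1 minus validation accuracy (auto-sklearn's default classification metric, as far as I know); the variable names are illustrative and not part of the notebook.

```python
from collections import defaultdict

# (model_id, ensemble_weight, type, cost), copied from the leaderboard output
leaderboard_rows = [
    (7, 0.10, "extra_trees", 0.014184),
    (2, 0.02, "random_forest", 0.028369),
    (36, 0.06, "k_nearest_neighbors", 0.028369),
    (26, 0.04, "extra_trees", 0.028369),
    (19, 0.02, "extra_trees", 0.028369),
    (22, 0.02, "gradient_boosting", 0.028369),
    (3, 0.14, "mlp", 0.028369),
    (12, 0.04, "gradient_boosting", 0.035461),
    (17, 0.02, "gradient_boosting", 0.035461),
    (8, 0.02, "random_forest", 0.035461),
    (37, 0.06, "gradient_boosting", 0.035461),
    (5, 0.04, "random_forest", 0.035461),
    (27, 0.10, "extra_trees", 0.042553),
    (34, 0.08, "random_forest", 0.042553),
    (9, 0.04, "extra_trees", 0.042553),
    (23, 0.02, "mlp", 0.049645),
    (35, 0.06, "extra_trees", 0.056738),
    (32, 0.02, "extra_trees", 0.063830),
    (38, 0.02, "extra_trees", 0.063830),
    (20, 0.02, "passive_aggressive", 0.078014),
    (30, 0.04, "adaboost", 0.078014),
    (29, 0.02, "gaussian_nb", 0.141844),
]

# The ensemble weights of all listed models add up to 1
total_weight = sum(weight for _, weight, _, _ in leaderboard_rows)

# Share of the ensemble contributed by each model family
weight_by_type = defaultdict(float)
for _, weight, model_type, _ in leaderboard_rows:
    weight_by_type[model_type] += weight

# Lowest-cost model; ASSUMPTION: cost = 1 - validation accuracy
best_model_id, _, best_type, best_cost = min(leaderboard_rows, key=lambda row: row[3])
best_validation_accuracy = 1.0 - best_cost

print(f"total ensemble weight: {total_weight:.2f}")
for model_type, weight in sorted(weight_by_type.items(), key=lambda kv: -kv[1]):
    print(f"{model_type:>20}: {weight:.2f}")
print(f"best model: id={best_model_id} ({best_type}), "
      f"validation accuracy ~ {best_validation_accuracy:.4f}")
```

Note that rank follows cost, not weight: the model with the largest single weight (model 3, `mlp`, 0.14) is ranked seventh.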
 %% Cell type:code id: tags:
 ``` 
 # Show the different models
 pprint(automl.show_models(), indent=4)
 ```
 %% Cell type:code id: tags:
 ``` 
 # Predict the test labels
 predictions = automl.predict(X_test)
 print("Accuracy score:", sklearn.metrics.accuracy_score(y_test, predictions))
 ```
 %% Output
    Accuracy score: 0.9440559440559441
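The reported accuracy can be turned back into counts. This sketch relies on two facts outside the notebook: the breast cancer dataset has 569 samples, and `train_test_split` holds out 25% (rounded up) when no `test_size` is given.

```python
import math

n_samples = 569          # size of sklearn's breast cancer dataset
test_fraction = 0.25     # train_test_split default when test_size is not set
reported_accuracy = 0.9440559440559441  # from the output above

# train_test_split rounds the test set size up
n_test = math.ceil(n_samples * test_fraction)

# Recover how many test samples were classified correctly
n_correct = round(reported_accuracy * n_test)
n_wrong = n_test - n_correct

print(f"test samples: {n_test}, correct: {n_correct}, misclassified: {n_wrong}")
```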
 %% Cell type:code id: tags:
 ``` 
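 # Export the model with the highest rank (model_id 7 in the leaderboard above;
 # the id may differ between runs)
 clf = automl.show_models()[7]['sklearn_classifier']
 # Use a context manager so the file is flushed and closed after writing
 with open('model.pickle', 'wb') as f:
     pickle.dump(clf, f)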
 ```
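A hedged sketch of reading an exported model back in. So that it runs without auto-sklearn installed, a plain scikit-learn `ExtraTreesClassifier` stands in for the exported `clf`, and the file is written to a temporary directory; both choices are assumptions made for illustration and are not part of the notebook.

```python
import pickle
import tempfile
from pathlib import Path

import sklearn.datasets
import sklearn.ensemble
import sklearn.model_selection

# Same data split as in the notebook
X, y = sklearn.datasets.load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = sklearn.model_selection.train_test_split(
    X, y, random_state=1
)

# STAND-IN for the exported classifier (hypothetical, not the auto-sklearn model)
clf = sklearn.ensemble.ExtraTreesClassifier(n_estimators=50, random_state=1)
clf.fit(X_train, y_train)

# Write the model, then read it back, closing the file each time
model_path = Path(tempfile.mkdtemp()) / "model.pickle"
with open(model_path, "wb") as f:
    pickle.dump(clf, f)
with open(model_path, "rb") as f:
    restored = pickle.load(f)

# The restored model should reproduce the original predictions exactly
same_predictions = bool((restored.predict(X_test) == clf.predict(X_test)).all())
print("Restored model reproduces predictions:", same_predictions)
```

Pickle files should be loaded with the same library versions that wrote them. Also, each `show_models()` entry appears to list preprocessing steps alongside `sklearn_classifier`; if the exported estimator was fitted on preprocessed features, raw inputs would need the same preprocessing before `predict`.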