Pandas

Merged ande2472 requested to merge ande2472-main-patch-29109 into main
1 file
+ 244
0
Pandas.ipynb 0 → 100644
%% Cell type:code id:fc73d74c tags:
``` python
##ANSWER##
# Install answercheck in the current directory
from urllib.request import urlretrieve
urlretrieve('https://raw.githubusercontent.com/colbrydi/jupytercheck/master/answercheck.py', filename='answercheck.py')
##ANSWER##
```
%% Output
('answercheck.py', <http.client.HTTPMessage at 0x7fc9a8612220>)
%% Cell type:markdown id:3c2a4f39 tags:
# Pandas
Pandas is a Python library used to analyze data.
%% Cell type:markdown id:2b78245b tags:
![Giant Panda Bear Eating Apples - Image found on Flickr](https://live.staticflickr.com/3296/2679911705_33e5b2db7d_b.jpg)
%% Cell type:markdown id:71e867d9 tags:
## Description
Pandas takes tabular data and stores it in objects known as DataFrames. It is useful for exploring, cleaning, and processing data.
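As a minimal sketch of what such a table object looks like, the example below builds a small DataFrame by hand and adds a computed column. The column names and values are made up for illustration and are not taken from any course data set.

```python
import pandas as pd

# Hypothetical sample data; the names and scores are invented for illustration.
scores = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus"],
        "test1": [90.0, 85.0, 70.0],
        "test2": [88.0, 92.0, 65.0],
    }
)

# Explore: shape is a (rows, columns) tuple.
n_rows, n_cols = scores.shape

# Process: add a column computed row by row from existing columns.
scores["average"] = scores[["test1", "test2"]].mean(axis=1)
print(scores)
```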
%% Cell type:markdown id:58ec20e8 tags:
## Self Assessment
Questions that test the learning goals and allow students to evaluate whether they truly understand the topics.
%% Cell type:markdown id:5d237ca9 tags:
## Training Materials
%% Cell type:markdown id:21f9b14e tags:
For a general understanding of how to use pandas, along with practice exercises, see the W3Schools tutorial: https://www.w3schools.com/python/pandas/default.asp
For the complete pandas documentation: https://pandas.pydata.org/docs/index.html
Pandas cheat sheets:
Data Analysis: https://drive.google.com/file/d/1UHK8wtWbADvHKXFC937IS6MTnlSZC_zB/view
Wrangling Data: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
%% Cell type:markdown id:1b047470 tags:
Go to https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html and download the “grades.csv” file. We will use it to solve the following problems:
&#9989; **<span style="color:red">Question:</span>** Create a dataframe variable to store the grades.csv table.
If you need help: https://www.w3schools.com/python/pandas/pandas_csv.asp
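The pattern can be sketched without any downloaded file, since `pd.read_csv` accepts a file-like object as well as a path. The CSV text below is a made-up stand-in, not the contents of grades.csv.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = "name,test1,test2\nAda,90,88\nGrace,85,92\n"

# read_csv accepts a path or any file-like object; the first line becomes the header.
sample = pd.read_csv(io.StringIO(csv_text))
print(sample.head())
```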
%% Cell type:code id:ee42a54e tags:
``` python
##ANSWER##
import pandas as pd

# Load the downloaded grades.csv (expected in the current directory)
df = pd.read_csv('grades.csv')
df
##ANSWER##
```
%% Cell type:markdown id:1e046af0 tags:
&#9989; **<span style="color:red">Question:</span>** Remove the 9th row (Andrew Airpump’s data) from the dataframe.
If you need help, refer to the Wrangling Data cheatsheet: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf
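One point worth illustrating is that default row labels start at 0, so the Nth row carries the label N-1. The sketch below uses an invented four-row roster rather than the grades data.

```python
import pandas as pd

# Hypothetical roster; names and scores are invented.
roster = pd.DataFrame(
    {"name": ["Ada", "Grace", "Linus", "Guido"], "score": [90, 85, 70, 95]}
)

# Default labels are 0, 1, 2, 3, so the 3rd row has label 2.
# drop returns a new DataFrame and leaves the original unchanged.
trimmed = roster.drop(index=[2])
print(trimmed)
```

Note that the remaining rows keep their original labels; `reset_index(drop=True)` would renumber them if that is wanted.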
%% Cell type:code id:180aff55 tags:
``` python
##ANSWER##
df_1 = df.drop([8])
df_1
##ANSWER##
```
%% Cell type:markdown id:43219d26 tags:
&#9989; **<span style="color:red">Question:</span>** Clean up column headers to remove unnecessary quotation marks.
If you need help: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rename.html
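`DataFrame.rename` accepts either a dict mapping old names to new ones or a function applied to every label. The sketch below uses a function so that all headers are cleaned at once. The messy headers and the helper `clean_header` are hypothetical, chosen only to mimic stray quotes and spaces.

```python
import pandas as pd

# Hypothetical frame whose headers carry stray quotes and leading spaces.
messy = pd.DataFrame({'"First name"': ["Ada"], ' "Test1"': [90.0]})


def clean_header(label: str) -> str:
    """Strip surrounding whitespace, then surrounding double quotes."""
    return label.strip().strip('"')


# Passing a function to columns= applies it to every column label.
tidy = messy.rename(columns=clean_header)
print(list(tidy.columns))
```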
%% Cell type:code id:4c9f8f12 tags:
``` python
##ANSWER##
# columns= takes a dict mapping each old header to its new name
df_2 = df_1.rename(columns={'"First name"': 'First name'})
df_2
##ANSWER##
```
%% Cell type:markdown id:f4e4a841 tags:
&#9989; **<span style="color:red">Question:</span>** Sort the data by student’s SSN. Store this data in a new dataframe called df_3.
If you need help: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html?highlight=sort_values#pandas.DataFrame.sort_values
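A small sketch of sorting on a single column follows. The `student_id` column and its values are invented; because they are strings, they sort lexicographically rather than numerically.

```python
import pandas as pd

# Hypothetical ID strings, not real data.
ids = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus"],
        "student_id": ["303-11-0000", "101-22-0000", "202-33-0000"],
    }
)

# sort_values returns a new, sorted DataFrame (ascending by default).
by_id = ids.sort_values(by="student_id")
print(by_id)
```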
%% Cell type:code id:b8c5166c tags:
``` python
##ANSWER##
# Sort by SSN (assumes the header has already been cleaned to 'SSN')
df_3 = df_2.sort_values(by='SSN')
df_3
##ANSWER##
```
%% Cell type:markdown id:11a12621 tags:
&#9989; **<span style="color:red">Question:</span>** Sort the data by each student's overall letter grade in the class and by their grade on the final.
If you need help: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html
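Sorting on two keys can be sketched by passing lists to `by` and `ascending`. The data below is invented, and the choice to sort the final score in descending order within each letter grade is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical grades; names and scores are invented.
grades = pd.DataFrame(
    {
        "name": ["Ada", "Grace", "Linus", "Guido"],
        "grade": ["B", "A", "B", "A"],
        "final": [80, 95, 88, 91],
    }
)

# Sort by letter grade first (A before B), then by final score, highest first.
ranked = grades.sort_values(by=["grade", "final"], ascending=[True, False])
print(ranked)
```

Letter grades are compared as plain strings here, so grades with plus or minus suffixes would not necessarily come out in academic order without extra handling.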
%% Cell type:markdown id:333f476e tags:
&#9989; **<span style="color:red">Question:</span>** Create a line graph plotting Test 1 scores against Test 2 scores.
If you need help: https://pandas.pydata.org/docs/user_guide/visualization.html
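As a rough sketch of the plotting call, `DataFrame.plot` takes the column names for each axis and returns a matplotlib `Axes`. The column names and scores below are invented, and matplotlib is assumed to be installed. The `Agg` backend line is only there so the sketch runs without a display; it is not needed inside a notebook.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; unnecessary in Jupyter

import pandas as pd

# Hypothetical scores for two tests.
tests = pd.DataFrame(
    {"test1": [40.0, 55.0, 70.0, 85.0], "test2": [50.0, 60.0, 65.0, 90.0]}
)

# Sort by the x column first so the line is drawn left to right.
ax = tests.sort_values("test1").plot(x="test1", y="test2", kind="line")
ax.set_ylabel("test2")
```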
%% Cell type:markdown id:f66f2294 tags:
&#9989; **<span style="color:red">Question:</span>** Perform a correlation test comparing Test 3 and Test 4 scores.
If you need help: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.corr.html
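A minimal sketch of computing a correlation follows, using invented data built to be perfectly linear so the Pearson coefficient is easy to check by eye. `Series.corr` gives a single coefficient between two columns, while `DataFrame.corr` gives the full pairwise matrix.

```python
import pandas as pd

# Hypothetical scores where test4 is exactly twice test3.
tests = pd.DataFrame(
    {"test3": [1.0, 2.0, 3.0, 4.0], "test4": [2.0, 4.0, 6.0, 8.0]}
)

# Pearson correlation (the default method) between two columns.
r = tests["test3"].corr(tests["test4"])

# Pairwise correlation matrix for all numeric columns.
matrix = tests.corr()
print(r)
print(matrix)
```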
%% Cell type:markdown id:dd9c7fa5 tags:
&#9989; **<span style="color:red">Question:</span>** Example answercheck question: What is $x = 2+2$?
%% Cell type:code id:64684dfd tags:
``` python
#Put your answer here
```
%% Cell type:code id:619a2259 tags:
``` python
##ANSWER##
x = 4
##ANSWER##
```
%% Cell type:code id:9cc2b34a tags:
``` python
from answercheck import checkanswer
checkanswer.vector(x,'2cab95d1b144d663bad1ce5c51020ae0')
```
%% Output
CheckWarning: passed variable is <class 'int'> and not a numpy.matrix.
Trying to convert to a array matrix using ```A = np.matrix(A)```.
CheckWarning: passed matrix is int64 and not <class 'numpy.float64'>...
Trying to convert to float using ```A = A.astype(float)```.
Testing [[4.]]
Answer seems to be correct
%% Cell type:markdown id:44b461a0 tags:
---
Written by <<YOUR NAME HERE>>, Michigan State University
As part of the Data Science Bridge Project
<a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/"><img alt="Creative Commons License" style="border-width:0" src="https://i.creativecommons.org/l/by-nc/4.0/88x31.png" /></a><br />This work is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-nc/4.0/">Creative Commons Attribution-NonCommercial 4.0 International License</a>.