Pandas

Merged ande2472 requested to merge ande2472-main-patch-29109 into main
1 file
+ 244
0
Pandas.ipynb 0 → 100644
+ 381
0
{
"cells": [
{
"cell_type": "code",
"execution_count": 32,
"id": "fc73d74c",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"('answercheck.py', <http.client.HTTPMessage at 0x7fc9a8612220>)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"##ANSWER##\n",
"# Install answercheck in the current directory\n",
"from urllib.request import urlretrieve\n",
"urlretrieve('https://raw.githubusercontent.com/colbrydi/jupytercheck/master/answercheck.py', filename='answercheck.py')\n",
"##ANSWER##"
]
},
{
"cell_type": "markdown",
"id": "3c2a4f39",
"metadata": {},
"source": [
"# Pandas\n",
"Pandas is a Python library used to analyze data."
]
},
{
"cell_type": "markdown",
"id": "2b78245b",
"metadata": {},
"source": [
"![Giant Panda Bear Eating Apples - Image found on Flickr](https://live.staticflickr.com/3296/2679911705_33e5b2db7d_b.jpg)\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "71e867d9",
"metadata": {},
"source": [
"## Description\n",
"Pandas takes tabular data and stores it in objects known as DataFrames. DataFrames are useful for exploring, cleaning, and processing data.\n"
]
},
{
"cell_type": "markdown",
"id": "58ec20e8",
"metadata": {},
"source": [
"## Self Assessment\n",
"\n",
"Questions that test the learning goals and allow students to evaluate whether they truly understand the topics."
]
},
{
"cell_type": "markdown",
"id": "5d237ca9",
"metadata": {},
"source": [
"## Training Materials\n",
"\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "21f9b14e",
"metadata": {},
"source": [
"For a general understanding of how to use pandas, along with practice exercises, check out the following website: https://www.w3schools.com/python/pandas/default.asp\n",
"\n",
"For a complete directory of everything you can do with pandas:\n",
"\n",
"https://pandas.pydata.org/docs/index.html\n",
"\n",
"Pandas Cheat Sheet:\n",
"\n",
"Data Analysis:\n",
"\thttps://drive.google.com/file/d/1UHK8wtWbADvHKXFC937IS6MTnlSZC_zB/view \n",
"\n",
"Wrangling Data:\n",
"https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "1b047470",
"metadata": {},
"source": [
"Go to https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html and download the “grades.csv” file. We will use it to solve the following problems:\n",
"\n",
"&#9989; **<span style=\"color:red\">Question:</span>** Create a DataFrame variable to store the grades.csv table.\n",
"\n",
"\t\n",
"\n",
"If you need help: https://www.w3schools.com/python/pandas/pandas_csv.asp\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "ee42a54e",
"metadata": {},
"outputs": [],
"source": [
"##ANSWER##\n",
"import pandas as pd\n",
"\n",
"df = pd.read_csv('grades.csv')\n",
"df\n",
"##ANSWER##"
]
},
{
"cell_type": "markdown",
"id": "1e046af0",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Remove the 9th row (Andrew Airpump’s data) from the dataframe.\n",
"\n",
"If you need help, refer to the Wrangling Data cheatsheet: https://pandas.pydata.org/Pandas_Cheat_Sheet.pdf\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "180aff55",
"metadata": {},
"outputs": [],
"source": [
"##ANSWER##\n",
"df_1 = df.drop([8])\n",
"df_1\n",
"\n",
"##ANSWER##"
]
},
{
"cell_type": "markdown",
"id": "43219d26",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Clean up column headers to remove unnecessary quotation marks.\n",
"\n",
"\n",
"If you need help: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rename.html\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "4c9f8f12",
"metadata": {},
"outputs": [],
"source": [
"##ANSWER##\n",
"# Strip surrounding whitespace and quotation marks from every column header\n",
"df_2 = df_1.rename(columns=lambda name: name.strip().strip('\"'))\n",
"df_2\n",
"\n",
"\n",
"##ANSWER##"
]
},
{
"cell_type": "markdown",
"id": "f4e4a841",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Sort the data by student’s SSN. Store this data in a new dataframe called df_3.\n",
"\n",
"\n",
"If you need help: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html?highlight=sort_values#pandas.DataFrame.sort_values\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "b8c5166c",
"metadata": {},
"outputs": [],
"source": [
"##ANSWER##\n",
"# Sort by SSN, as the question asks\n",
"df_3 = df_2.sort_values(by='SSN')\n",
"df_3\n",
"\n",
"\n",
"\n",
"##ANSWER##"
]
},
{
"cell_type": "markdown",
"id": "11a12621",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Sort the data first by each student's overall letter grade in the class and then by their grade on the final.\n",
"\n",
"\n",
"If you need help: Same as above\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "333f476e",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Create a line graph plotting Test 1 scores against Test 2 scores.\n",
"\n",
"\n",
"\n",
"If you need help: https://pandas.pydata.org/docs/user_guide/visualization.html\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "f66f2294",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Perform a correlation test comparing Test 3 and Test 4 scores.\n",
"\n",
"\n",
"If you need help: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.corr.html\n",
"\n"
]
},
{
"cell_type": "markdown",
"id": "dd9c7fa5",
"metadata": {},
"source": [
"&#9989; **<span style=\"color:red\">Question:</span>** Example answercheck question: What is $x = 2+2$?\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "64684dfd",
"metadata": {},
"outputs": [],
"source": [
"#Put your answer here"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "619a2259",
"metadata": {},
"outputs": [],
"source": [
"##ANSWER##\n",
"x = 4\n",
"##ANSWER##"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "9cc2b34a",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"CheckWarning: passed variable is <class 'int'> and not a numpy.matrix.\n",
" Trying to convert to a array matrix using ```A = np.matrix(A)```.\n",
"\n",
"\n",
"CheckWarning: passed matrix is int64 and not <class 'numpy.float64'>...\n",
" Trying to convert to float using ```A = A.astype(float)```.\n",
"\n",
"Testing [[4.]]\n",
"Answer seems to be correct\n",
"\n"
]
}
],
"source": [
"from answercheck import checkanswer\n",
"checkanswer.vector(x,'2cab95d1b144d663bad1ce5c51020ae0')"
]
},
{
"cell_type": "markdown",
"id": "44b461a0",
"metadata": {},
"source": [
"---\n",
"\n",
"Written by <<YOUR NAME HERE>>, Michigan State University \n",
"As part of the Data Science Bridge Project \n",
" \n",
"<a rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc/4.0/\"><img alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https://i.creativecommons.org/l/by-nc/4.0/88x31.png\" /></a><br />This work is licensed under a <a rel=\"license\" href=\"http://creativecommons.org/licenses/by-nc/4.0/\">Creative Commons Attribution-NonCommercial 4.0 International License</a>."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.12"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
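
Note on the unanswered questions: the notebook's two-column sort and Test 3 / Test 4 correlation questions have no answer cells. The block below is a minimal sketch of one way they might be approached, not the author's intended solution. The rows are made up for illustration, and the column names (`Grade`, `Final`, `Test3`, `Test4`) and the quoted header style are assumptions about what the loaded grades table looks like.

```python
import pandas as pd

# Made-up rows for illustration only; the header strings imitate names
# that still carry stray spaces and quotation marks after loading.
df = pd.DataFrame({
    'Last name': ['Able', 'Baker', 'Clark', 'Dodd'],
    ' "Test3"': [80.0, 60.0, 90.0, 70.0],
    ' "Test4"': [70.0, 50.0, 85.0, 65.0],
    ' "Final"': [45.0, 30.0, 49.0, 40.0],
    ' "Grade"': ['B', 'C', 'A', 'B'],
})

# Strip surrounding whitespace and quotation marks from every header.
clean = df.rename(columns=lambda name: name.strip().strip('"'))

# Sort by letter grade first, then by final-exam score (highest first)
# within each letter grade.
by_grade = clean.sort_values(by=['Grade', 'Final'], ascending=[True, False])

# Pearson correlation coefficient between the two test columns.
r = clean['Test3'].corr(clean['Test4'])

print(by_grade)
print(f"Test3 vs Test4 correlation: {r:.3f}")
```

Passing a list to `by` makes `sort_values` use the later columns only to break ties in the earlier ones, which is why `Grade` comes first. `Series.corr` defaults to the Pearson method; `DataFrame.corr` would instead return the full pairwise matrix.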