Titanic Survivor Analysis
Download the CSV file from https://www.kaggle.com/datasets/tedllh/titanic-train.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
titanic_df = pd.read_csv('titanic_train.csv')
titanic_df
titanic_df['Survived'].groupby(titanic_df['Sex']).mean()
Sex
female 0.742038
male 0.188908
Name: Survived, dtype: float64
titanic_df.pivot_table(index='Sex', values='Survived')['Survived']
Sex
female 0.742038
male 0.188908
Name: Survived, dtype: float64
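The groupby call and the pivot_table call above return the same survival rates. Here is a minimal sketch that checks this on a tiny made-up frame. The rows in `toy_df` are hypothetical and do not come from the Kaggle file.

```python
import pandas as pd

# Hypothetical toy data, not taken from titanic_train.csv
toy_df = pd.DataFrame({
    'Sex': ['female', 'female', 'male', 'male', 'male'],
    'Survived': [1, 0, 0, 0, 1],
})

# Survival rate per sex via groupby
by_groupby = toy_df.groupby('Sex')['Survived'].mean()

# Same rate via pivot_table (mean is the default aggfunc)
by_pivot = toy_df.pivot_table(index='Sex', values='Survived')['Survived']

print(by_groupby)
print(by_pivot)
```

Passing `values='Survived'` limits the aggregation to that one column, so text columns such as names never reach the mean.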
titanic_df.pivot_table(index='Pclass', values='Survived', aggfunc='sum')['Survived']
Pclass
1 136
2 87
3 119
Name: Survived, dtype: int64
titanic_df['Survived'].groupby(titanic_df['Pclass']).sum()
Pclass
1 136
2 87
3 119
Name: Survived, dtype: int64
# Bin each passenger's age into a decade (22 -> 20.0, 38 -> 30.0)
ages = []
for index, row in titanic_df.iterrows():
    ages.append((row['Age']//10) * 10)
titanic_df['ages'] = ages
titanic_df['Survived'].groupby(titanic_df['ages']).sum().sort_values(ascending=False)
ages
20.0 77
30.0 73
10.0 41
0.0 38
40.0 34
50.0 20
60.0 6
80.0 1
70.0 0
Name: Survived, dtype: int64
titanic_df['Survived'].groupby(titanic_df['ages']).sum().plot(kind='bar')
# Same binning as a function; with axis=1, apply passes one row at a time
def age_to_ages(row):
    return (row['Age']//10) * 10
titanic_df.apply(age_to_ages, axis=1)
0 20.0
1 30.0
2 20.0
3 30.0
4 30.0
...
886 20.0
887 10.0
888 NaN
889 20.0
890 30.0
Length: 891, dtype: float64
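Both the iterrows loop and the row-wise apply walk through the frame one row at a time. The same decade binning can probably be written as a single column operation. This is a sketch on made-up ages, not the post's own method. `toy_df` and its values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, including a missing value like row 888 above
toy_df = pd.DataFrame({'Age': [22.0, 38.0, 4.0, np.nan, 80.0]})

# Floor-divide the whole column at once; NaN stays NaN
toy_df['ages'] = toy_df['Age'] // 10 * 10

print(toy_df)
```

On the real data, this would be `titanic_df['ages'] = titanic_df['Age'] // 10 * 10`.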