Chapter 07. Python Pandas Basics (1)

Nacho_13 2024. 2. 23. 20:41

Alright, let's review pandas.

# Import the library
import pandas as pd  # no questions asked, just import it under the usual alias

Creating a DataFrame

1. From a dictionary

# Build the dictionary
dict1 = {'Name': ['Gildong', 'Sarang', 'Jiemae', 'Yeoin'],
        'Level': ['Gold', 'Bronze', 'Silver', 'Gold'],
        'Score': [56000, 23000, 44000, 52000]}

df = pd.DataFrame(dict1)

Output:

	Name	Level	Score
0	Gildong	Gold	56000
1	Sarang	Bronze	23000
2	Jiemae	Silver	44000
3	Yeoin	Gold	52000
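To double-check how the dictionary maps onto the frame, here is a small sketch using the same dict1 as above. Each key becomes a column name, each list becomes that column's values, and pandas assigns a default 0-based index. The name df_members is only an illustrative choice, not something from the original post.

```python
import pandas as pd

dict1 = {'Name': ['Gildong', 'Sarang', 'Jiemae', 'Yeoin'],
         'Level': ['Gold', 'Bronze', 'Silver', 'Gold'],
         'Score': [56000, 23000, 44000, 52000]}

# df_members is a hypothetical name used only for this sketch
df_members = pd.DataFrame(dict1)

print(df_members.shape)          # (4, 3): 4 rows, 3 columns
print(list(df_members.columns))  # ['Name', 'Level', 'Score']
print(list(df_members.index))    # [0, 1, 2, 3]
```

Note that every list in the dictionary has the same length; lists of different lengths would raise a ValueError.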

2. Reading a CSV file

# Read the data
path = 'https://raw.githubusercontent.com/DA4BAM/dataset/master/titanic_simple.csv'
df = pd.read_csv(path)

# Show only the first 5 rows
df.head()

Output:

	PassengerId	Survived	Pclass	Name	Sex	Age	Fare	Embarked
0	1	0	3	Braund, Mr. Owen Harris	male	22	7.25	Southampton
1	2	1	1	Cumings, Mrs. John Bradley (Florence Briggs Thayer)	female	38	71.2833	Cherbourg
2	3	1	3	Heikkinen, Miss. Laina	female	26	7.925	Southampton
3	4	1	1	Futrelle, Mrs. Jacques Heath (Lily May Peel)	female	35	53.1	Southampton
4	5	0	3	Allen, Mr. William Henry	male	35	8.05	Southampton
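If the URL is not reachable, the same read_csv call can be tried offline. The sketch below feeds it an in-memory string through io.StringIO instead of a path. The three rows are copied from the output above; csv_text and sample are illustrative names, and the quoting around the Name values is an assumption about how the real file escapes the commas in names.

```python
import io
import pandas as pd

# A tiny stand-in for the real file (assumed layout; rows taken from the head() output)
csv_text = '''PassengerId,Survived,Pclass,Name,Sex,Age,Fare,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,7.25,Southampton
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,71.2833,Cherbourg
3,1,3,"Heikkinen, Miss. Laina",female,26,7.925,Southampton
'''

# read_csv accepts a file-like object as well as a path or URL
sample = pd.read_csv(io.StringIO(csv_text))

print(sample.head(2))   # head(n) returns the first n rows (default n=5)
print(sample.shape)     # (3, 8)
```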

Inspecting DataFrame attributes

pd.DataFrame.info()

# Check each column's data type and non-null count
df.info()
'''
Output:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 891 entries, 0 to 890
Data columns (total 8 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   PassengerId  891 non-null    int64  
 1   Survived     891 non-null    int64  
 2   Pclass       891 non-null    int64  
 3   Name         891 non-null    object 
 4   Sex          891 non-null    object 
 5   Age          714 non-null    float64
 6   Fare         891 non-null    float64
 7   Embarked     889 non-null    object 
dtypes: float64(2), int64(3), object(3)
memory usage: 55.8+ KB
'''
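The Non-Null Count column is where missing values show up: Age has 714 non-null entries out of 891 rows, so 177 values are missing. A minimal sketch of the same idea on a toy frame follows (toy, buf and report are made-up names for illustration). info() prints its report and returns None, so the sketch passes a buffer through the buf parameter to capture the text.

```python
import io
import pandas as pd

# Hypothetical toy frame with one missing Age value
toy = pd.DataFrame({'Age': [22.0, None, 26.0],
                    'Sex': ['male', 'female', 'female']})

# info() writes to stdout by default; buf redirects the report into a string
buf = io.StringIO()
toy.info(buf=buf)
report = buf.getvalue()
print(report)

# rows minus non-null count = number of missing values in the column
missing_age = len(toy) - toy['Age'].count()
print(missing_age)  # 1
```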

pd.DataFrame.columns

# Check the column names
df.columns
'''
Output:
Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'Fare',
       'Embarked'],
      dtype='object')
'''

pd.DataFrame.describe()

# Check summary statistics (numeric columns only by default)
df.describe()

Output:

	PassengerId	Survived	Pclass	Age	Fare
count	891	891	891	714	891
mean	446	0.383838	2.30864	29.6991	32.2042
std	257.354	0.486592	0.836071	14.5265	49.6934
min	1	0	1	0.42	0
25%	223.5	0	2	20.125	7.9104
50%	446	0	3	28	14.4542
75%	668.5	1	3	38	31
max	891	1	3	80	512.329

pd.DataFrame.describe().T

# Check summary statistics, transposed so each column becomes a row
df.describe().T

Output:

	count	mean	std	min	25%	50%	75%	max
PassengerId	891	446	257.354	1	223.5	446	668.5	891
Survived	891	0.383838	0.486592	0	0	0	1	1
Pclass	891	2.30864	0.836071	1	2	3	3	3
Age	714	29.6991	14.5265	0.42	20.125	28	38	80
Fare	891	32.2042	49.6934	0	7.9104	14.4542	31	512.329

pd.DataFrame.value_counts()

# Count how often each unique (Embarked, Pclass) combination appears
df[['Embarked','Pclass']].value_counts()
'''
Output:
Embarked     Pclass
Southampton  3         353
             2         164
             1         127
Cherbourg    1          85
Queenstown   3          72
Cherbourg    3          66
             2          17
Queenstown   2           3
             1           2
dtype: int64
'''
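With two columns selected, value_counts() counts each unique pair of values and returns a Series indexed by those pairs, sorted from most to least frequent. A minimal sketch on a made-up four-row frame (toy and counts are illustrative names):

```python
import pandas as pd

# Hypothetical toy data: (Southampton, 3) appears twice, the other pairs once
toy = pd.DataFrame({'Embarked': ['Southampton', 'Southampton', 'Cherbourg', 'Southampton'],
                    'Pclass': [3, 3, 1, 1]})

counts = toy[['Embarked', 'Pclass']].value_counts()
print(counts)

# Look up a single combination by its (Embarked, Pclass) pair
print(counts.loc[('Southampton', 3)])  # 2
```

Rows containing missing values are left out of the count by default, which would explain why the totals above add up to 889 rather than 891 (Embarked has 889 non-null entries).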

pd.DataFrame.sort_values()

# Sort by Fare, highest first, and show the top 10 rows
df.sort_values(by='Fare',ascending=False).head(10)

ascending=False sorts in descending order; ascending=True (the default) sorts in ascending order.
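Here is a quick sketch of both directions on a made-up frame (toy, desc and asc are illustrative names). sort_values() returns a new sorted frame and leaves the original untouched, and the rows keep their original index labels.

```python
import pandas as pd

# Hypothetical toy data
toy = pd.DataFrame({'Name': ['a', 'b', 'c'],
                    'Fare': [7.25, 71.2833, 7.925]})

desc = toy.sort_values(by='Fare', ascending=False)  # highest fare first
asc = toy.sort_values(by='Fare')                    # default: lowest fare first

print(list(desc['Fare']))  # [71.2833, 7.925, 7.25]
print(list(asc['Fare']))   # [7.25, 7.925, 71.2833]
print(list(desc.index))    # [1, 2, 0]: original row labels are preserved
```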
