Mini Project Solutions
Mini Project 1 Problems
1. Analysis of the tmdb_5000_movies dataset
Problems 1 and 2
import pandas as pd
t_df = pd.read_csv('../Downloads/tmdb_5000_movies.csv')
t_df.drop(columns=['homepage','tagline','status'],inplace=True)
# Preprocess the JSON columns by extracting only `name`; keywords are handled separately because the spaces inside a keyword are meaningful
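The preprocessing comment above describes pulling the `name` field out of JSON-encoded columns. The cell body is not shown, so the following is only a minimal sketch, assuming each cell holds a JSON list of objects with a `name` key. The helper `extract_names` and the sample strings are hypothetical.

```python
import json

def extract_names(cell, sep=" "):
    """Return the `name` values from a JSON list string, joined by `sep`."""
    return sep.join(item["name"] for item in json.loads(cell))

# Hypothetical sample cells shaped like the `genres` and `keywords` columns.
genres_cell = '[{"id": 28, "name": "Action"}, {"id": 878, "name": "Science Fiction"}]'
keywords_cell = '[{"id": 1, "name": "independent film"}, {"id": 2, "name": "murder"}]'

# Genres joined with spaces, as in the genre strings shown in the outputs.
genres_text = extract_names(genres_cell)

# Keywords joined with a separator that cannot occur inside a keyword,
# then split again, so multi-word keywords stay whole.
keywords_list = extract_names(keywords_cell, sep="|").split("|")

print(genres_text)
print(keywords_list)
```

On a DataFrame this could be applied column-wise, for example `t_df['genres'].apply(extract_names)`, provided the column really has that JSON shape.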
# 1) How are budget and genre related?
budget | ... | vote_count | |||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
genres | Action | Action Adventure | Action Adventure Animation Comedy Family | Action Adventure Animation Comedy Family Fantasy Romance | Action Adventure Animation Comedy Family Fantasy Science Fiction | Action Adventure Animation Comedy Science Fiction | Action Adventure Animation Family | Action Adventure Animation Family Fantasy | Action Adventure Animation Fantasy Science Fiction | Action Adventure Animation Science Fiction Thriller | Action Adventure Comedy | Action Adventure Comedy Crime | Action Adventure Comedy Crime Drama | Action Adventure Comedy Crime Mystery Thriller | Action Adventure Comedy Crime Romance Thriller | Action Adventure Comedy Crime Thriller | Action Adventure Comedy Drama Family Music Romance | Action Adventure Comedy Drama Foreign | Action Adventure Comedy Drama Mystery | Action Adventure Comedy Drama Science Fiction Thriller | Action Adventure Comedy Family | Action Adventure Comedy Family Fantasy | Action Adventure Comedy Family Fantasy Science Fiction | Action Adventure Comedy Family Science Fiction | ... | War | War Action | War Action Adventure Drama Thriller | War Action Drama History Thriller | War Adventure Drama Romance | War Comedy Drama | War Crime Drama Mystery Romance Thriller | War Drama | War Drama Action | War Drama History | War Drama History Action | War Drama History Action Romance | War Drama Romance | War History Action Adventure Drama Romance | War History Drama | War Western | Western | Western Action Drama History | Western Adventure | Western Animation Adventure Comedy Family | Western Comedy | Western Drama | Western Drama Adventure Thriller | Western History | Western History War | |
budget_cat | |||||||||||||||||||||||||||||||||||||||||||||||||||
(-0.001, 7500000.0] | 27 | 6 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 3 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 2 | 0 | 0 | 1 | 0 | 0 | 1 | 15 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
(7500000.0, 22000000.0] | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 2 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | ... | 1 | 0 | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 2 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 |
(22000000.0, 50000000.0] | 0 | 7 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 | 1 | ... | 0 | 0 | 0 | 1 | 2 | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
(50000000.0, 380000000.0] | 0 | 5 | 7 | 2 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 8 | 0 | 0 | 1 | 1 | 3 | 0 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | ... | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | 0 | 1 |
4 rows × 18800 columns
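The column headers above are whole genre combinations (for example `Action Adventure Comedy`), which is why the table is so wide. A sketch of a narrower alternative, assuming `budget_cat` was built as quartile bins and that genres are available as lists; the toy data and variable names are hypothetical.

```python
import pandas as pd

# Hypothetical toy data; the notebook itself works on t_df.
toy = pd.DataFrame({
    "budget": [1_000_000, 5_000_000, 20_000_000, 40_000_000,
               60_000_000, 90_000_000, 150_000_000, 250_000_000],
    "genres": [["Drama"], ["Drama", "Comedy"], ["Comedy"], ["Action", "Comedy"],
               ["Action"], ["Action", "Adventure"], ["Adventure"], ["Action", "Adventure"]],
})

# Quartile bins of budget (assumption: budget_cat was built in a similar way).
toy["budget_cat"] = pd.qcut(toy["budget"], q=4)

# One row per (movie, genre) pair so that each genre is counted on its own.
per_genre = toy.explode("genres")
budget_by_genre = pd.crosstab(per_genre["budget_cat"], per_genre["genres"])
print(budget_by_genre)
```

Counting single genres gives one column per genre, which makes the budget bins directly comparable across genres.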
# 2) Which words are used most often as keywords?
[('independent film', 192), ('murder', 172), ('violence', 146), ('dystopia', 120), ('duringcreditsstinger', 116)]
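A list of `(keyword, count)` pairs like the one above is the shape returned by `collections.Counter.most_common`. A minimal sketch, assuming one keyword list per movie; the sample lists are hypothetical.

```python
from collections import Counter

# Hypothetical keyword lists, one list per movie.
keyword_lists = [
    ["independent film", "murder"],
    ["murder", "violence"],
    ["independent film", "dystopia"],
    ["independent film"],
]

# Flatten all lists and count each keyword as a whole phrase.
counts = Counter(kw for kws in keyword_lists for kw in kws)
top = counts.most_common(2)
print(top)
```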
# 3) How are genres and keywords related?
title | Avatar | Pirates of the Caribbean: At World's End | Spectre | The Dark Knight Rises | John Carter | Spider-Man 3 | Tangled | Avengers: Age of Ultron | Harry Potter and the Half-Blood Prince | Batman v Superman: Dawn of Justice | Superman Returns | Quantum of Solace | Pirates of the Caribbean: Dead Man's Chest | The Lone Ranger | Man of Steel | The Chronicles of Narnia: Prince Caspian | The Avengers | Pirates of the Caribbean: On Stranger Tides | Men in Black 3 | The Hobbit: The Battle of the Five Armies | The Amazing Spider-Man | Robin Hood | The Hobbit: The Desolation of Smaug | The Golden Compass | King Kong | ... | Rampage | Slacker | Dutch Kills | Dry Spell | Flywheel | Backmask | The Puffy Chair | Stories of Our Lives | Breaking Upwards | All Superheroes Must Die | Pink Flamingos | Clean | The Circle | Tin Can Man | Cure | On The Downlow | Sanctuary: Quite a Conundrum | Bang | Primer | Cavite | El Mariachi | Newlyweds | Signed, Sealed, Delivered | Shanghai Calling | My Date with Drew |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
title | |||||||||||||||||||||||||||||||||||||||||||||||||||
Avatar | 1.000000 | 0.033911 | 0.017435 | 0.004221 | 0.248501 | 0.035139 | 0.024894 | 0.059988 | 0.023661 | 0.022763 | 0.043666 | 0.022542 | 0.022596 | 0.010072 | 0.088634 | 0.043253 | 0.077686 | 0.102427 | 0.143133 | 0.092661 | 0.021569 | 0.029362 | 0.045073 | 0.022733 | 0.013829 | ... | 0.010297 | 0.008659 | 0.000000 | 0.061893 | 0.000000 | 0.000000 | 0.023269 | 0.000000 | 0.031020 | 0.074724 | 0.030946 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.012150 | 0.000000 | 0.000000 | 0.048523 | 0.000000 | 0.006793 | 0.056478 | 0.026349 | 0.0 | 0.0 |
Pirates of the Caribbean: At World's End | 0.033911 | 1.000000 | 0.021037 | 0.034132 | 0.012511 | 0.111809 | 0.011378 | 0.017092 | 0.044458 | 0.027466 | 0.029152 | 0.027200 | 0.474144 | 0.012153 | 0.029551 | 0.068232 | 0.038306 | 0.208616 | 0.008180 | 0.028966 | 0.037160 | 0.029612 | 0.030120 | 0.027430 | 0.142986 | ... | 0.012424 | 0.000000 | 0.000000 | 0.000000 | 0.015401 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.018194 | 0.038199 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.008197 | 0.000000 | 0.021293 | 0.0 | 0.0 |
Spectre | 0.017435 | 0.021037 | 1.000000 | 0.078061 | 0.070146 | 0.060568 | 0.030749 | 0.111236 | 0.017189 | 0.057658 | 0.103059 | 0.549934 | 0.022014 | 0.018065 | 0.062035 | 0.076458 | 0.062227 | 0.023734 | 0.012160 | 0.023387 | 0.021013 | 0.024623 | 0.018135 | 0.096466 | 0.024804 | ... | 0.045032 | 0.000000 | 0.097970 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.012025 | 0.000000 | 0.000000 | 0.0 | 0.070458 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.123730 | 0.000000 | 0.000000 | 0.0 | 0.0 |
The Dark Knight Rises | 0.004221 | 0.034132 | 0.078061 | 1.000000 | 0.004502 | 0.051288 | 0.024913 | 0.062652 | 0.000000 | 0.180473 | 0.109040 | 0.109994 | 0.005329 | 0.004373 | 0.150406 | 0.023430 | 0.059022 | 0.005746 | 0.065193 | 0.005662 | 0.082179 | 0.005961 | 0.000000 | 0.000000 | 0.012210 | ... | 0.061043 | 0.079586 | 0.186559 | 0.000000 | 0.007272 | 0.019680 | 0.020162 | 0.094137 | 0.000000 | 0.120491 | 0.013621 | 0.010274 | 0.027136 | 0.0 | 0.019153 | 0.004788 | 0.032963 | 0.010261 | 0.093245 | 0.023555 | 0.033142 | 0.000000 | 0.007541 | 0.0 | 0.0 |
John Carter | 0.248501 | 0.012511 | 0.070146 | 0.004502 | 1.000000 | 0.012965 | 0.049956 | 0.088164 | 0.010223 | 0.034291 | 0.154023 | 0.024046 | 0.013092 | 0.032782 | 0.151277 | 0.060955 | 0.105641 | 0.073203 | 0.119286 | 0.063651 | 0.012497 | 0.031320 | 0.078151 | 0.057371 | 0.014752 | ... | 0.010983 | 0.009237 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.079708 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.034295 | 0.012961 | 0.000000 | 0.000000 | 0.051759 | 0.000000 | 0.007246 | 0.000000 | 0.000000 | 0.0 | 0.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
El Mariachi | 0.006793 | 0.008197 | 0.123730 | 0.033142 | 0.007246 | 0.008494 | 0.000000 | 0.009899 | 0.000000 | 0.008641 | 0.009172 | 0.053005 | 0.008578 | 0.007039 | 0.009297 | 0.000000 | 0.009326 | 0.009248 | 0.011084 | 0.009113 | 0.008188 | 0.009594 | 0.000000 | 0.000000 | 0.009665 | ... | 0.056640 | 0.000000 | 0.146807 | 0.000000 | 0.000000 | 0.031677 | 0.000000 | 0.000000 | 0.000000 | 0.032542 | 0.010962 | 0.000000 | 0.000000 | 0.0 | 0.019168 | 0.000000 | 0.053056 | 0.000000 | 0.009552 | 0.037913 | 1.000000 | 0.000000 | 0.000000 | 0.0 | 0.0 |
Newlyweds | 0.056478 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.053283 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.051092 | 0.000000 | 0.332759 | 0.000000 | 0.000000 | 0.411991 | 0.000000 | 0.549240 | 0.000000 | 0.036634 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.275376 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 | 0.154088 | 0.0 | 0.0 |
Signed, Sealed, Delivered | 0.026349 | 0.021293 | 0.000000 | 0.007541 | 0.000000 | 0.022065 | 0.000000 | 0.064265 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.060542 | 0.000000 | 0.008210 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.005281 | ... | 0.009198 | 0.007873 | 0.033924 | 0.051274 | 0.006188 | 0.000000 | 0.080640 | 0.080105 | 0.084631 | 0.000000 | 0.005645 | 0.008742 | 0.023091 | 0.0 | 0.042630 | 0.004074 | 0.042432 | 0.008732 | 0.005635 | 0.000000 | 0.000000 | 0.154088 | 1.000000 | 0.0 | 0.0 |
Shanghai Calling | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.0 |
My Date with Drew | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.132800 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | ... | 0.000000 | 0.044517 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 0.0 | 1.0 |
4799 rows × 4799 columns
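A square title-by-title matrix with 1.0 on the diagonal and values between 0 and 1 is consistent with a cosine similarity matrix over text features. The vectorizer used in the notebook is not shown, so this sketch uses plain term counts with pandas and NumPy; the token lists and names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical per-movie token lists (for example genre plus keyword tokens).
docs = pd.Series({
    "Movie A": ["action", "space", "alien"],
    "Movie B": ["action", "space", "war"],
    "Movie C": ["romance", "wedding"],
})

# Term-count matrix: one row per title, one column per token.
long_form = docs.explode().rename("token").rename_axis("title").reset_index()
counts = pd.crosstab(long_form["title"], long_form["token"]).astype(float)

# Cosine similarity is the dot product of L2-normalised rows.
values = counts.to_numpy()
unit = values / np.linalg.norm(values, axis=1, keepdims=True)
sim = pd.DataFrame(unit @ unit.T, index=counts.index, columns=counts.index)
print(sim.round(3))
```

Movies that share no tokens get a similarity of 0, which matches the many zero entries in the matrix above.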
# 4) How are average rating and genre related?
vote_average | action | adventure | animation | comedy | crime | documentary | drama | family | fantasy | fiction | foreign | history | horror | movie | music | mystery | romance | science | thriller | tv | war | western |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 266 | 149 | 31 | 427 | 95 | 19 | 273 | 122 | 99 | 137 | 7 | 10 | 203 | 4 | 22 | 59 | 152 | 137 | 288 | 4 | 7 | 15 |
2 | 291 | 203 | 54 | 458 | 146 | 9 | 416 | 124 | 109 | 122 | 3 | 21 | 130 | 1 | 32 | 70 | 207 | 122 | 309 | 1 | 19 | 12 |
3 | 279 | 162 | 45 | 397 | 190 | 13 | 503 | 102 | 83 | 104 | 5 | 44 | 95 | 1 | 39 | 102 | 192 | 104 | 321 | 1 | 29 | 13 |
4 | 173 | 144 | 51 | 267 | 139 | 24 | 578 | 92 | 67 | 98 | 13 | 54 | 55 | 1 | 50 | 55 | 200 | 98 | 200 | 1 | 36 | 16 |
5 | 145 | 132 | 53 | 173 | 126 | 43 | 526 | 73 | 66 | 74 | 6 | 68 | 36 | 1 | 42 | 62 | 143 | 74 | 156 | 1 | 53 | 26 |
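In the table above the `science` and `fiction` columns carry identical counts, as do `movie` and `tv`, which suggests that the multi-word genres "Science Fiction" and "TV Movie" were split on whitespace. A sketch that keeps each genre name whole by counting list elements, assuming five quantile bins labelled 1 (lowest) to 5 (highest); the toy data and the labelling direction are assumptions.

```python
import pandas as pd

# Hypothetical toy data; genre names are list elements, not space-split tokens.
toy = pd.DataFrame({
    "vote_average": [4.1, 5.0, 5.8, 6.2, 6.5, 6.9, 7.3, 7.8, 8.1, 8.4],
    "genres": [["Horror"], ["Horror", "Science Fiction"], ["Comedy"], ["Comedy"],
               ["Science Fiction"], ["Drama"], ["Drama", "Science Fiction"],
               ["Drama"], ["Drama"], ["Drama", "TV Movie"]],
})

# Five equal-sized rating groups labelled 1 (lowest) to 5 (highest).
toy["cat_voteav"] = pd.qcut(toy["vote_average"], q=5, labels=[1, 2, 3, 4, 5])

# One row per (movie, genre) pair, then count genres per rating group.
per_genre = toy.explode("genres")
rating_by_genre = pd.crosstab(per_genre["cat_voteav"], per_genre["genres"])
print(rating_by_genre)
```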
# 5) Which genres were produced most in each year?
cat_year | action | adventure | animation | comedy | crime | documentary | drama | family | fantasy | fiction | foreign | history | horror | movie | music | mystery | romance | science | thriller | tv | war | western |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 135 | 69 | 12 | 203 | 45 | 14 | 136 | 61 | 42 | 76 | 2 | 2 | 114 | 2 | 12 | 29 | 69 | 76 | 145 | 2 | 3 | 10 |
2 | 131 | 80 | 19 | 224 | 50 | 5 | 137 | 61 | 57 | 61 | 5 | 8 | 89 | 2 | 10 | 30 | 83 | 61 | 143 | 2 | 4 | 5 |
3 | 127 | 79 | 24 | 205 | 59 | 5 | 156 | 56 | 44 | 59 | 1 | 4 | 56 | 0 | 17 | 27 | 83 | 59 | 136 | 0 | 3 | 5 |
4 | 164 | 124 | 30 | 253 | 87 | 4 | 260 | 68 | 65 | 63 | 2 | 17 | 74 | 1 | 15 | 43 | 124 | 63 | 173 | 1 | 16 | 7 |
5 | 117 | 68 | 10 | 154 | 85 | 3 | 187 | 37 | 26 | 43 | 4 | 13 | 44 | 1 | 18 | 43 | 81 | 43 | 136 | 1 | 13 | 4 |
6 | 162 | 94 | 35 | 243 | 105 | 10 | 316 | 65 | 57 | 61 | 1 | 31 | 51 | 0 | 21 | 59 | 111 | 61 | 185 | 0 | 16 | 9 |
7 | 95 | 72 | 28 | 130 | 59 | 7 | 231 | 41 | 33 | 52 | 4 | 27 | 32 | 0 | 24 | 27 | 87 | 52 | 95 | 0 | 17 | 8 |
8 | 78 | 72 | 23 | 137 | 80 | 17 | 347 | 51 | 34 | 46 | 9 | 27 | 23 | 1 | 26 | 28 | 113 | 46 | 105 | 1 | 19 | 8 |
9 | 74 | 59 | 18 | 88 | 59 | 17 | 235 | 24 | 26 | 31 | 2 | 31 | 19 | 0 | 23 | 32 | 68 | 31 | 85 | 0 | 21 | 10 |
10 | 71 | 73 | 35 | 85 | 67 | 26 | 291 | 49 | 40 | 43 | 4 | 37 | 17 | 1 | 19 | 30 | 75 | 43 | 71 | 1 | 32 | 16 |
# 6) How are popularity and budget related?
budget_cat | (-0.001, 7500000.0] | (7500000.0, 22000000.0] | (22000000.0, 50000000.0] | (50000000.0, 380000000.0] |
---|---|---|---|---|
cat_voteav | ||||
1 | 508 | 174 | 177 | 138 |
2 | 352 | 212 | 250 | 234 |
3 | 345 | 217 | 249 | 214 |
4 | 375 | 202 | 188 | 157 |
5 | 341 | 180 | 139 | 147 |
# 7) Is there a relationship between a movie's run time and its popularity?
cat_voteav | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
cat_runtime | |||||
1 | 389 | 229 | 156 | 111 | 87 |
2 | 278 | 241 | 202 | 158 | 87 |
3 | 195 | 248 | 240 | 169 | 123 |
4 | 90 | 217 | 238 | 245 | 167 |
5 | 45 | 113 | 189 | 239 | 343 |
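Raw counts are hard to compare across rows with different totals. The sketch below re-enters the counts from the table above and converts each `cat_runtime` row to proportions. Reading the labels 1 to 5 as ordered bins is an assumption, since the binning code is not shown.

```python
import pandas as pd

# Counts copied from the cat_runtime x cat_voteav table above.
counts = pd.DataFrame(
    [[389, 229, 156, 111, 87],
     [278, 241, 202, 158, 87],
     [195, 248, 240, 169, 123],
     [90, 217, 238, 245, 167],
     [45, 113, 189, 239, 343]],
    index=pd.Index([1, 2, 3, 4, 5], name="cat_runtime"),
    columns=pd.Index([1, 2, 3, 4, 5], name="cat_voteav"),
)

# Share of each rating group within a runtime group (each row sums to 1).
row_share = counts.div(counts.sum(axis=1), axis=0)
print(row_share.round(3))
```

In these counts the share of `cat_voteav` 5 rises from about 9% in `cat_runtime` 1 (87 of 972) to about 37% in `cat_runtime` 5 (343 of 929). If both labels are in ascending order, that is consistent with longer movies tending to fall into higher rating groups.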
def genre_recommendations(df, mat, items):
index | title | Avatar | score | total_score |
---|---|---|---|---|
0 | Alien | 0.377950 | 7.9 | 10.252950 |
1 | Aliens | 0.427672 | 7.7 | 10.052672 |
2 | Star Trek Into Darkness | 0.420840 | 7.4 | 9.670840 |
3 | Gravity | 0.417716 | 7.3 | 9.542716 |
4 | Treasure Planet | 0.320138 | 7.2 | 9.320138 |
5 | Stargate: The Ark of Truth | 0.315874 | 6.9 | 8.940874 |
6 | Spaceballs | 0.329561 | 6.7 | 8.704561 |
7 | Starship Troopers | 0.298873 | 6.7 | 8.673873 |
8 | Silent Running | 0.410423 | 6.3 | 8.285423 |
9 | Space Dogs | 0.391469 | 6.3 | 8.266469 |
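The displayed rows are consistent with `total_score = similarity + 1.25 * score`; for Alien, 0.377950 + 1.25 × 7.9 = 10.252950. The body of `genre_recommendations` is not shown, so the following is only a sketch under that inferred formula. The function name `genre_recommendations_sketch`, the parameters `top_k` and `weight`, the candidate-selection step, and the toy inputs are all assumptions.

```python
import pandas as pd

def genre_recommendations_sketch(df, sim, title, top_k=3, weight=1.25):
    """Rank the top_k most similar titles by similarity + weight * vote_average."""
    # Similarity of every other title to `title`, highest first.
    candidates = sim[title].drop(title).nlargest(top_k)
    ratings = df.set_index("title").loc[candidates.index, "vote_average"]
    out = pd.DataFrame({
        "title": candidates.index,
        title: candidates.to_numpy(),
        "score": ratings.to_numpy(),
    })
    out["total_score"] = out[title] + weight * out["score"]
    return out.sort_values("total_score", ascending=False).reset_index(drop=True)

# Toy inputs. The Avatar similarities and ratings of Alien and Aliens are
# taken from the table above; every other number is hypothetical.
titles = ["Avatar", "Alien", "Aliens", "Other"]
movies = pd.DataFrame({"title": titles, "vote_average": [7.2, 7.9, 7.7, 5.0]})
sim = pd.DataFrame(
    [[1.0, 0.377950, 0.427672, 0.0],
     [0.377950, 1.0, 0.5, 0.0],
     [0.427672, 0.5, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]],
    index=titles, columns=titles,
)

result = genre_recommendations_sketch(movies, sim, "Avatar", top_k=2)
print(result)
```

Adding a weighted rating to the similarity lets a well-rated title outrank a slightly more similar one, which is what happens with Alien and Aliens in the table.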
Problem 3
# #3. Dataset analysis and association rule generation
# Convert the Date column to datetime
# No NA/NaN data
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 38765 entries, 0 to 38764
Data columns (total 3 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Member_number 38765 non-null int64
1 Date 38765 non-null datetime64[ns]
2 itemDescription 38765 non-null object
dtypes: datetime64[ns](1), int64(1), object(1)
memory usage: 908.7+ KB
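The `info()` output above shows `Date` as `datetime64[ns]` with no missing values. A sketch of the conversion and of the calendar columns used later (`year`, `month`, `weekday`), assuming day-first date strings; the sample rows and the string format are hypothetical.

```python
import pandas as pd

# Hypothetical rows shaped like the grocery data; the date format is an assumption.
g = pd.DataFrame({
    "Member_number": [1000, 1000, 1001],
    "Date": ["21-07-2015", "05-01-2014", "30-12-2014"],
    "itemDescription": ["whole milk", "yogurt", "rolls/buns"],
})

# An explicit format avoids day/month ambiguity when parsing.
g["Date"] = pd.to_datetime(g["Date"], format="%d-%m-%Y")
g["year"] = g["Date"].dt.year
g["month"] = g["Date"].dt.month
g["weekday"] = g["Date"].dt.weekday  # Monday = 0 ... Sunday = 6

n_missing = int(g.isna().sum().sum())  # total number of missing cells
print(g.dtypes)
print(n_missing)
```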
# 2) Which items sold the most / least in each year?
year
2014 1038.0
2015 1464.0
dtype: float64
a
itemDescription | Instant food products | UHT-milk | abrasive cleaner | artif. sweetener | baby cosmetics | bags | baking powder | bathroom cleaner | beef | berries | beverages | bottled beer | bottled water | brandy | brown bread | butter | butter milk | cake bar | candles | candy | canned beer | canned fish | canned fruit | canned vegetables | cat food | ... | sparkling wine | specialty bar | specialty cheese | specialty chocolate | specialty fat | specialty vegetables | spices | spread cheese | sugar | sweet spreads | syrup | tea | tidbits | toilet cleaner | tropical fruit | turkey | vinegar | waffles | whipped/sour cream | whisky | white bread | white wine | whole milk | yogurt | zwieback |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
year | |||||||||||||||||||||||||||||||||||||||||||||||||||
2014 | 37.0 | 160.0 | 12.0 | 13.0 | 1.0 | 1.0 | 67.0 | 11.0 | 177.0 | 128.0 | 109.0 | 319.0 | 504.0 | 17.0 | 323.0 | 273.0 | 126.0 | 54.0 | 34.0 | 136.0 | 266.0 | 70.0 | 8.0 | 49.0 | 109.0 | ... | 25.0 | 110.0 | 37.0 | 125.0 | 20.0 | 10.0 | 23.0 | 48.0 | 155.0 | 34.0 | 13.0 | 19.0 | 12.0 | 5.0 | 364.0 | 27.0 | 29.0 | 166.0 | 365.0 | 3.0 | 213.0 | 82.0 | 1038.0 | 640.0 | 24.0 |
2015 | 23.0 | 163.0 | 10.0 | 16.0 | 2.0 | 3.0 | 55.0 | 6.0 | 339.0 | 199.0 | 142.0 | 368.0 | 429.0 | 21.0 | 248.0 | 261.0 | 137.0 | 39.0 | 32.0 | 83.0 | 451.0 | 46.0 | 13.0 | 33.0 | 68.0 | ... | 21.0 | 100.0 | 35.0 | 115.0 | 9.0 | 1.0 | 17.0 | 52.0 | 110.0 | 35.0 | 8.0 | 8.0 | 10.0 | NaN | 668.0 | 53.0 | 22.0 | 114.0 | 297.0 | 5.0 | 149.0 | 94.0 | 1464.0 | 694.0 | 36.0 |
2 rows × 167 columns
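The yearly maxima shown earlier (1038 for 2014 and 1464 for 2015) match the `whole milk` column of this table. A sketch of how both the best-selling and the least-sold item per year can be read off a year-by-item count table. It uses `pd.crosstab`, which fills combinations that never occur with 0 rather than NaN (compare the NaN under `toilet cleaner` for 2015). The toy rows and the name `yearly` are hypothetical.

```python
import pandas as pd

# Hypothetical purchases: one row per item bought.
g = pd.DataFrame({
    "year": [2014, 2014, 2014, 2014, 2015, 2015, 2015],
    "itemDescription": ["whole milk", "whole milk", "yogurt", "whole milk",
                        "yogurt", "whole milk", "whole milk"],
})

# Year x item count table; items never bought in a year count as 0.
yearly = pd.crosstab(g["year"], g["itemDescription"])

most = yearly.idxmax(axis=1)   # best-selling item per year
least = yearly.idxmin(axis=1)  # least-sold item per year
summary = pd.DataFrame({
    "most": most, "n_most": yearly.max(axis=1),
    "least": least, "n_least": yearly.min(axis=1),
})
print(summary)
```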
#3
Member_number | year | weekday | month |
---|---|---|---|
1000 | 26192 | 43 | 76 |
1001 | 24175 | 35 | 51 |
1002 | 16116 | 46 | 36 |
1003 | 16114 | 24 | 30 |
1004 | 42296 | 27 | 150 |
... | ... | ... | ... |
4996 | 20150 | 24 | 90 |
4997 | 12088 | 36 | 50 |
4998 | 4030 | 4 | 20 |
4999 | 32236 | 58 | 62 |
5000 | 14101 | 27 | 34 |
3898 rows × 3 columns
g_df_dummies = pd.get_dummies(g_df,columns=['weekday','month'])
g_df_dummies.groupby('Member_number').sum(numeric_only=True).iloc[:,1:].sum()
weekday_0 5382
weekday_1 5558
weekday_2 5562
weekday_3 5620
weekday_4 5562
weekday_5 5551
weekday_6 5530
month_1 3324
month_2 2997
month_3 3133
month_4 3260
month_5 3408
month_6 3264
month_7 3300
month_8 3496
month_9 3059
month_10 3261
month_11 3254
month_12 3009
dtype: int64
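Summing calendar codes per member, as in the `Member_number` table above, gives totals such as 26192 for `year` that are hard to interpret. Counting rows per weekday and per month is usually what is wanted. The per-weekday and per-month totals can also be obtained without dummy columns; a sketch assuming `weekday` and `month` columns already exist, with hypothetical rows.

```python
import pandas as pd

# Hypothetical purchase rows with calendar fields already extracted.
g = pd.DataFrame({
    "Member_number": [1000, 1000, 1001, 1002, 1002],
    "weekday": [0, 2, 2, 6, 2],
    "month": [1, 1, 7, 12, 7],
})

# Number of purchase rows per weekday and per month, in calendar order.
weekday_counts = g["weekday"].value_counts().sort_index()
month_counts = g["month"].value_counts().sort_index()
print(weekday_counts)
print(month_counts)
```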
# 4,5
# Give the VIP title to the top 100 members by number of purchases
/var/folders/pr/27tft1vj6396wqnj02ngz5p80000gn/T/ipykernel_28378/1060107556.py:2: FutureWarning: The behavior of `series[i:j]` with an integer-dtype index is deprecated. In a future version, this will be treated as *label-based* indexing, consistent with e.g. `series[i]` lookups. To retain the old behavior, use `series.iloc[i:j]`. To get the future behavior, use `series.loc[i:j]`.
b = g_df.groupby('Member_number')['itemDescription'].size().sort_values(ascending=False)[:100]
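The FutureWarning above comes from slicing a Series with an integer index using `[:100]`. A sketch of the same top-N selection with explicit positional slicing, assuming the goal is the members with the most purchase rows; the toy rows and names are hypothetical, and `n_vip` stands in for 100.

```python
import pandas as pd

# Hypothetical purchase rows.
g = pd.DataFrame({
    "Member_number": [1000, 1000, 1000, 1001, 1002, 1002],
    "itemDescription": ["whole milk", "yogurt", "soda", "soda", "beef", "yogurt"],
})

n_vip = 2  # the notebook uses 100
purchases = g.groupby("Member_number")["itemDescription"].size()

# .iloc slices by position, so the integer member numbers are never
# mistaken for positions (or the other way round).
top = purchases.sort_values(ascending=False).iloc[:n_vip]

# .copy() makes `vip` an independent frame, so later column assignments
# do not write into a slice of `g`.
vip = g[g["Member_number"].isin(top.index)].copy()
print(top)
```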
from mlxtend.preprocessing import TransactionEncoder
/var/folders/pr/27tft1vj6396wqnj02ngz5p80000gn/T/ipykernel_28378/1298349058.py:6: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
vip['itemDescription'] = vip['itemDescription'].apply(lambda x : x+',')
/opt/homebrew/lib/python3.10/site-packages/mlxtend/frequent_patterns/fpcommon.py:111: DeprecationWarning: DataFrames with non-bool types result in worse computationalperformance and their support might be discontinued in the future.Please use a DataFrame with bool type
warnings.warn(
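The DeprecationWarning above asks for a DataFrame of boolean type. A pandas-only sketch of a member-by-item boolean basket matrix that could then be handed to a frequent-itemset routine; it does not call mlxtend itself, and the toy rows are hypothetical.

```python
import pandas as pd

# Hypothetical VIP purchase rows.
vip = pd.DataFrame({
    "Member_number": [1000, 1000, 1000, 1002, 1002],
    "itemDescription": ["whole milk", "yogurt", "whole milk", "yogurt", "rolls/buns"],
})

# True where the member bought the item at least once; item names stay intact.
basket = pd.crosstab(vip["Member_number"], vip["itemDescription"]).astype(bool)

# Support of each single item = share of members whose basket contains it.
support = basket.mean()
print(basket)
print(support)
```

Because items stay as column labels, no string manipulation of `itemDescription` is needed before encoding.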
index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction |
---|---|---|---|---|---|---|---|---|---|
8 | yogurt,tropicalfruit,rolls/buns | othervegetables,cannedbeer,wholemilk | 0.11 | 0.21 | 0.09 | 0.82 | 3.9 | 0.0669 | 4.345000 |
14 | yogurt,tropicalfruit,rolls/buns | othervegetables,cannedbeer,wholemilk | 0.11 | 0.21 | 0.09 | 0.82 | 3.9 | 0.0669 | 4.345000 |
20 | yogurt,tropicalfruit,rolls/buns | othervegetables,cannedbeer,wholemilk | 0.11 | 0.21 | 0.09 | 0.82 | 3.9 | 0.0669 | 4.345000 |
9 | othervegetables,cannedbeer,wholemilk | yogurt,tropicalfruit,rolls/buns | 0.21 | 0.11 | 0.09 | 0.43 | 3.9 | 0.0669 | 1.557500 |
15 | othervegetables,cannedbeer,wholemilk | yogurt,tropicalfruit,rolls/buns | 0.21 | 0.11 | 0.09 | 0.43 | 3.9 | 0.0669 | 1.557500 |
21 | othervegetables,cannedbeer,wholemilk | yogurt,tropicalfruit,rolls/buns | 0.21 | 0.11 | 0.09 | 0.43 | 3.9 | 0.0669 | 1.557500 |
6 | yogurt,rolls/buns,tropicalfruit,wholemilk | othervegetables,cannedbeer | 0.10 | 0.25 | 0.09 | 0.90 | 3.6 | 0.0650 | 7.500000 |
12 | tropicalfruit,wholemilk,yogurt,rolls/buns | othervegetables,cannedbeer | 0.10 | 0.25 | 0.09 | 0.90 | 3.6 | 0.0650 | 7.500000 |
17 | yogurt,rolls/buns,tropicalfruit,wholemilk | othervegetables,cannedbeer | 0.10 | 0.25 | 0.09 | 0.90 | 3.6 | 0.0650 | 7.500000 |
11 | othervegetables,cannedbeer | yogurt,rolls/buns,tropicalfruit,wholemilk | 0.25 | 0.10 | 0.09 | 0.36 | 3.6 | 0.0650 | 1.406250 |
18 | othervegetables,cannedbeer | yogurt,rolls/buns,tropicalfruit,wholemilk | 0.25 | 0.10 | 0.09 | 0.36 | 3.6 | 0.0650 | 1.406250 |
23 | othervegetables,cannedbeer | tropicalfruit,wholemilk,yogurt,rolls/buns | 0.25 | 0.10 | 0.09 | 0.36 | 3.6 | 0.0650 | 1.406250 |
0 | yogurt,tropicalfruit,rolls/buns | othervegetables,cannedbeer | 0.11 | 0.25 | 0.09 | 0.82 | 3.3 | 0.0625 | 4.125000 |
2 | yogurt,tropicalfruit,rolls/buns | othervegetables,cannedbeer | 0.11 | 0.25 | 0.09 | 0.82 | 3.3 | 0.0625 | 4.125000 |
4 | yogurt,tropicalfruit,rolls/buns | othervegetables,cannedbeer | 0.11 | 0.25 | 0.09 | 0.82 | 3.3 | 0.0625 | 4.125000 |
1 | othervegetables,cannedbeer | yogurt,tropicalfruit,rolls/buns | 0.25 | 0.11 | 0.09 | 0.36 | 3.3 | 0.0625 | 1.390625 |
3 | othervegetables,cannedbeer | yogurt,tropicalfruit,rolls/buns | 0.25 | 0.11 | 0.09 | 0.36 | 3.3 | 0.0625 | 1.390625 |
5 | othervegetables,cannedbeer | yogurt,tropicalfruit,rolls/buns | 0.25 | 0.11 | 0.09 | 0.36 | 3.3 | 0.0625 | 1.390625 |
10 | yogurt,cannedbeer,othervegetables | rolls/buns,tropicalfruit,wholemilk | 0.13 | 0.22 | 0.09 | 0.69 | 3.1 | 0.0614 | 2.535000 |
16 | yogurt,cannedbeer,othervegetables | rolls/buns,tropicalfruit,wholemilk | 0.13 | 0.22 | 0.09 | 0.69 | 3.1 | 0.0614 | 2.535000 |
22 | yogurt,cannedbeer,othervegetables | rolls/buns,tropicalfruit,wholemilk | 0.13 | 0.22 | 0.09 | 0.69 | 3.1 | 0.0614 | 2.535000 |
7 | rolls/buns,tropicalfruit,wholemilk | yogurt,cannedbeer,othervegetables | 0.22 | 0.13 | 0.09 | 0.41 | 3.1 | 0.0614 | 1.472308 |
13 | rolls/buns,tropicalfruit,wholemilk | yogurt,cannedbeer,othervegetables | 0.22 | 0.13 | 0.09 | 0.41 | 3.1 | 0.0614 | 1.472308 |
19 | rolls/buns,tropicalfruit,wholemilk | yogurt,cannedbeer,othervegetables | 0.22 | 0.13 | 0.09 | 0.41 | 3.1 | 0.0614 | 1.472308 |
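The metric columns follow from the three support values in each row. The sketch below recomputes them for the first rule in the table using the standard definitions of confidence, lift, leverage and conviction. Each rule also appears three times (for example rows 8, 14 and 20), which may mean the same itemset was generated in several orderings; that is a guess, since the encoding code is only partly shown.

```python
# Supports read from the first rule in the table above
# (yogurt,tropicalfruit,rolls/buns -> othervegetables,cannedbeer,wholemilk).
support_a = 0.11   # antecedent support
support_c = 0.21   # consequent support
support_ac = 0.09  # support of antecedent and consequent together

confidence = support_ac / support_a             # P(consequent | antecedent)
lift = confidence / support_c                   # confidence relative to chance
leverage = support_ac - support_a * support_c   # excess over independence
conviction = (1 - support_c) / (1 - confidence)

print(round(confidence, 2), round(lift, 1), round(leverage, 4), round(conviction, 3))
```

A lift well above 1, as here, means the consequent items appear together with the antecedent items more often than independence would predict.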
Problem 4
# 4. This dataset is used to predict the type of wine from its chemical composition. It is loaded with the load_wine() command and is structured as follows.