python - Recode list values in dataframe column
I'm trying to recode values in a dataframe column that are organized in list format. I know how to replace string values in a dataframe column, but I'm struggling with how to do it inside a list.
Here is a snippet of the data:
{0: '[crime, drama]', 1: '[crime, drama]', 2: '[crime, drama]', 3: '[action, crime, drama, thriller]', 4: '[crime, drama]', 5: '[biography, drama, history]', 6: '[crime, drama]', 7: '[adventure, drama, fantasy]', 8: '[western]', 9: '[drama]'}
For example, I'd like to recode crimes as thrillers, and biography as history.
I know the below works for replacing string values:
df.loc[df['genre'] == 'crime', 'genre'] = 'thriller'
But how do I modify values inside a list?
Thanks!
Edit: the code used to create the dataframe (with data extracted from the IMDb database) is:
import numpy as np
import pandas as pd

# these are the variables I want (i.e. am able to) extract from the movie object
metadata = ('title', 'rating', 'genre', "plot", "language", "runtime", "year", "color", "country", "votes")

# creates a dataframe with the variable names as headers
df = pd.DataFrame(np.random.randn(250, len(metadata)), columns=metadata)

# these are different data types, including lists, so this makes it compile
df = df.astype('object')

# populate the df with movie objects (movies_list comes from the IMDb extraction, not shown)
for i in range(250):
    for j in metadata:
        df.loc[i, j] = movies_list[i].get(j)

# convert to the right data types:
metadata_dict_dtypes = {"title": unicode, "rating": float, "genre": list, "plot": str, "language": list, "runtime": list, "year": int, "color": list, "country": list, "votes": int}
for colname, my_dtype in metadata_dict_dtypes.iteritems():
    df[colname] = df[colname].astype(my_dtype)
Assuming you have a correctly formatted list in your dataframe, you can write a function that takes a row and a genre name-change map as arguments, and apply it to the dataframe. For example:
name_map = {'crime': 'thriller', 'biography': 'history'}

def change_names(row, name_map):
    # replace each genre that appears in name_map, in place
    for name in name_map:
        if name in row.genre:
            row.genre[row.genre.index(name)] = name_map[name]
    return row

df = df.apply(lambda row: change_names(row, name_map), axis=1)
It's not vectorized, but it gets the job done.
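If the column actually holds strings such as '[crime, drama]' (which is what the snippet in the question looks like) rather than real lists, the values would need parsing first. Here is a minimal sketch of that, working on the genre column only instead of whole rows. The helper names `parse_genres` and `recode_genres` are made up for illustration, the small sample dataframe is not the real IMDb data, and the parsing assumes genre names contain no commas or brackets.

```python
import pandas as pd

# Recoding map taken from the example above.
name_map = {'crime': 'thriller', 'biography': 'history'}

def parse_genres(value):
    """Turn a string such as '[crime, drama]' into ['crime', 'drama'].

    Values that are already lists are copied into a new list.
    """
    if isinstance(value, str):
        inner = value.strip().strip('[]')
        return [g.strip() for g in inner.split(',') if g.strip()]
    return list(value)

def recode_genres(genres, name_map):
    """Return a new list with every genre found in name_map replaced."""
    return [name_map.get(g, g) for g in genres]

# Small made-up sample in the same shape as the snippet in the question.
df = pd.DataFrame({'genre': ['[crime, drama]',
                             '[biography, drama, history]',
                             '[western]']})

df['genre'] = (df['genre']
               .apply(parse_genres)
               .apply(lambda genres: recode_genres(genres, name_map)))

print(df['genre'].tolist())
# [['thriller', 'drama'], ['history', 'drama', 'history'], ['western']]
```

Note that recoding can leave duplicates in a list (the second row ends up with 'history' twice), so you may want to de-duplicate afterwards if that matters for your analysis. Like the row-wise version, this is still not vectorized.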