How to draw analysis charts with matplotlib, and the code to do it
Admin, 2022-08-02, 群英技术资讯 (Qunying Tech News)
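A good analyst should also know a few tricks for making charts look polished, so the results are something you can be proud to show. Today's tip walks through the most common chart types and how to draw each one.
First, load the datasets. We use the same two as before: Salary_Ranges_by_Job_Classification and GlobalLandTemperaturesByCity. (To get the datasets, reply "plot" in the account backend.)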
# Import the usual packages
import pandas as pd
import numpy as np
import seaborn as sns
# Jupyter-only magic; remove this line when running as a plain script
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib as mpl
plt.style.use('fivethirtyeight')

# For displaying Chinese text (Mac)
from matplotlib.font_manager import FontProperties

# List the styles available on this machine
print(plt.style.available)
# Pick one of the available styles; I already knew ggplot looks good, so I chose it
mpl.style.use(['ggplot'])
# ['_classic_test', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark-palette', 'seaborn-dark', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'seaborn', 'Solarize_Light2']

# Load the datasets
# Dataset 1: Salary_Ranges_by_Job_Classification
salary_ranges = pd.read_csv('./data/Salary_Ranges_by_Job_Classification.csv')
# Dataset 2: GlobalLandTemperaturesByCity
climate = pd.read_csv('./data/GlobalLandTemperaturesByCity.csv')
# Drop rows with missing values
climate.dropna(axis=0, inplace=True)

# Convert dt to a datetime and extract the year (note the use of map)
climate['dt'] = pd.to_datetime(climate['dt'])
climate['year'] = climate['dt'].map(lambda value: value.year)

# Keep China only; .copy() avoids a SettingWithCopyWarning on the next assignment
climate_sub_china = climate.loc[climate['Country'] == 'China'].copy()
climate_sub_china['Century'] = climate_sub_china['year'].map(lambda x: int(x / 100 + 1))
climate.head()
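The setup above imports FontProperties for Chinese text but never configures a font. One common way to do it is through rcParams, sketched below. The font names in `candidates` are assumptions (they differ by operating system), and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs headless
import matplotlib.pyplot as plt
from matplotlib import font_manager

# Candidate fonts that contain CJK glyphs. These names are assumptions:
# which ones exist depends on the operating system and installed fonts.
candidates = ["Arial Unicode MS", "PingFang SC", "SimHei",
              "Microsoft YaHei", "Noto Sans CJK SC"]

# Names of the fonts matplotlib has actually found on this machine.
installed = {f.name for f in font_manager.fontManager.ttflist}
available = [name for name in candidates if name in installed]

# Put the usable CJK fonts first and keep the current defaults as fallback.
plt.rcParams["font.sans-serif"] = available + list(plt.rcParams["font.sans-serif"])
# Draw the minus sign as an ASCII hyphen, since some CJK fonts lack U+2212.
plt.rcParams["axes.unicode_minus"] = False

print("CJK fonts found:", available)
```

If `available` comes back empty, no listed font is installed and Chinese labels will still render as boxes until one is added.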
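The line chart is one of the simpler charts, and there is not much to tune beyond picking colors that are easy on the eye. Below is a color table found online that you can choose from.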
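As a complement to a color table, the named colors that matplotlib accepts can be listed from code. This is a small sketch using `matplotlib.colors.CSS4_COLORS`; the handful of names printed are simply the ones used in this article.

```python
import matplotlib.colors as mcolors

# CSS4_COLORS maps each named color to its hex string.
named = mcolors.CSS4_COLORS

# Look up the color names used in this article.
for name in ["lime", "steelblue", "darkred", "lightgreen", "lightblue"]:
    print(f"{name:12s} {named[name]}")

print(len(named), "named colors available")
```

Any of these names can be passed to the `color` argument of the plotting calls.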
# Select part of the Shanghai weather data
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .set_index('dt')
df1.head()
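# Line chart
# pandas' plot takes `color`, not `colors`
df1.plot(color=['lime'])
plt.title('AverageTemperature Of ShangHai')
# The y axis shows temperature (the original label was a copy-paste leftover)
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()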
That was a single line. Multiple lines work the same way: add more columns to the DataFrame and each column becomes its own line.
# Multi-line chart: one temperature column per city
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SY'})
# Merge on the date
df123 = df1.merge(df2, how='inner', on=['dt'])\
    .merge(df3, how='inner', on=['dt'])\
    .set_index(['dt'])
df123.head()
# Multi-line chart
df123.plot()
plt.title('AverageTemperature Of 3 City')
# The y axis shows temperature (the original label was a copy-paste leftover)
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
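The three filter-and-merge steps used to build the multi-city table could also be written as a single pivot. The sketch below is an alternative, not the article's method, and it runs on a tiny synthetic table whose temperatures are made up; only the column names match the climate dataset.

```python
import pandas as pd

# Tiny synthetic stand-in for the climate table (values are made up).
dates = pd.date_range("2010-01-01", periods=3, freq="MS")
toy = pd.DataFrame({
    "dt": list(dates) * 3,
    "City": ["Shanghai"] * 3 + ["Tianjin"] * 3 + ["Shenyang"] * 3,
    "AverageTemperature": [4.0, 6.5, 9.0, -3.0, 0.5, 6.0, -11.0, -6.0, 1.5],
})

# One row per date, one column per city, then shorten the column names.
wide = (
    toy.pivot(index="dt", columns="City", values="AverageTemperature")
       .rename(columns={"Shanghai": "SH", "Tianjin": "TJ", "Shenyang": "SY"})
)
print(wide)
```

Calling `wide.plot()` then draws one line per city, just as with the merged table. Note that `pivot` raises an error if a (date, city) pair appears more than once.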
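Next is the pie chart, which gives us more to tune, for example how far each slice is pulled away from the center. Let's draw a "basic" pie chart first.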
# Sum Step per SetID; selecting the column first keeps sum() on numeric data only
df1 = salary_ranges.groupby('SetID')[['Step']].sum()
# "Basic" pie chart
df1['Step'].plot(kind='pie', figsize=(7,7),
                 autopct='%1.1f%%',
                 shadow=True)
plt.axis('equal')
plt.show()
# "Deluxe" pie chart
# Slice colors; other options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
colors = ['lightgreen', 'lightblue']
# How far each slice is pulled out; larger means further apart (one value per slice)
explode = [0, 0.2]
df1['Step'].plot(kind='pie', figsize=(7, 7),
                 autopct='%1.1f%%', startangle=90,
                 shadow=True, labels=None, pctdistance=1.12, colors=colors, explode=explode)
plt.axis('equal')
plt.legend(labels=df1.index, loc='upper right', fontsize=14)
plt.show()
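The `explode` list needs exactly one entry per slice, so a hard-coded list breaks as soon as the number of groups changes. Here is a minimal sketch that derives it from the data instead, pulling out only the smallest slice. The series `totals` is made-up illustration data, and the Agg backend is used only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, only so this runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Made-up totals standing in for the grouped 'Step' column.
totals = pd.Series({"A": 50, "B": 30, "C": 20}, name="Step")

# One explode value per slice: pull out the smallest slice, leave the rest.
smallest = totals.idxmin()
explode = [0.2 if label == smallest else 0 for label in totals.index]

ax = totals.plot(kind="pie", figsize=(7, 7), autopct="%1.1f%%",
                 startangle=90, labels=None, pctdistance=1.12,
                 explode=explode)
ax.set_ylabel("")
ax.axis("equal")
# Pass the wedges explicitly so each legend entry matches its slice.
ax.legend(handles=list(ax.patches), labels=list(totals.index), loc="upper right")
plt.savefig("pie_sketch.png")
```

The same list comprehension works for any number of groups, so the chart keeps rendering when the data changes.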
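Scatter plots leave less room for tuning. The ggplot style's colors already look good; as the saying goes, pick a good style and you save a lot of work!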
# Select part of the Shanghai and Shenyang weather data
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SY'})
# Merge on the date
df12 = df1.merge(df2, how='inner', on=['dt'])
df12.head()
# Scatter plot
df12.plot(kind='scatter', x='SH', y='SY', figsize=(10, 6), color='darkred')
plt.title('Average Temperature Between ShangHai - ShenYang')
plt.xlabel('ShangHai')
plt.ylabel('ShenYang')
plt.show()
# Area chart: one temperature column per city (same data as the multi-line chart)
df1 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SH'})
df2 = climate.loc[(climate['Country']=='China')&(climate['City']=='Tianjin')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'TJ'})
df3 = climate.loc[(climate['Country']=='China')&(climate['City']=='Shenyang')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .rename(columns={'AverageTemperature':'SY'})
# Merge on the date
df123 = df1.merge(df2, how='inner', on=['dt'])\
    .merge(df3, how='inner', on=['dt'])\
    .set_index(['dt'])
df123.head()
# Area colors; other options: ['lightgreen', 'lightblue', 'pink', 'purple', 'grey', 'gold']
colors = ['red', 'pink', 'blue']
# pandas' plot takes `color`, not `colors`, for area charts
df123.plot(kind='area', stacked=False,
           figsize=(20, 10), color=colors)
plt.title('AverageTemperature Of 3 City')
plt.ylabel('AverageTemperature')
plt.xlabel('Years')
plt.show()
# Select part of the Shanghai weather data
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .set_index('dt')
df.head()
# The simplest histogram
# pandas' plot takes `color`, not `colors`
df['AverageTemperature'].plot(kind='hist', figsize=(8,5), color='grey')
plt.title('ShangHai AverageTemperature Of 2010-2013')  # add a title to the histogram
plt.ylabel('Number of months')  # add y-label
plt.xlabel('AverageTemperature')  # add x-label
plt.show()
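The histogram call relies on the default of 10 bins. To see exactly which bins the values fall into, or to line the x ticks up with the bin edges, `numpy.histogram` returns the counts and edges directly. This sketch uses made-up monthly temperatures rather than the real dataset.

```python
import numpy as np

# Made-up monthly temperatures, for illustration only.
temps = np.array([4.1, 5.0, 9.2, 14.8, 20.1, 24.3,
                  28.6, 28.0, 23.9, 18.5, 12.7, 6.3])

# Split the value range into 5 equal-width bins.
counts, bin_edges = np.histogram(temps, bins=5)

print("counts per bin:", counts)
print("bin edges:     ", bin_edges)
```

Passing `bins=5` and `xticks=bin_edges` to `plot(kind='hist', ...)` would then draw the same bins with ticks on the bin boundaries.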
# Select part of the Shanghai weather data
df = climate.loc[(climate['Country']=='China')&(climate['City']=='Shanghai')&(climate['dt']>='2010-01-01')]\
    .loc[:,['dt','AverageTemperature']]\
    .set_index('dt')
df.head()
df.plot(kind='bar', figsize = (10, 6))
plt.xlabel('Month')
plt.ylabel('AverageTemperature')
plt.title('AverageTemperature of shanghai')
plt.show()
df.plot(kind='barh', figsize=(12, 16), color='steelblue')
plt.xlabel('AverageTemperature')
plt.ylabel('Month')
plt.title('AverageTemperature of shanghai')
plt.show()
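With a datetime index, bar charts label every bar with the full timestamp, which gets crowded. One possible tweak is to format the index as year-month strings before plotting. The frame `toy` below is made-up data with the same layout as `df`.

```python
import pandas as pd

# Made-up frame with the same layout as df: datetime index, one temperature column.
dates = pd.date_range("2010-01-01", periods=4, freq="MS")
toy = pd.DataFrame({"AverageTemperature": [4.0, 6.5, 9.0, 15.2]}, index=dates)

# Replace the timestamps with short 'YYYY-MM' labels for the bar axis.
toy.index = toy.index.strftime("%Y-%m")

print(toy.index.tolist())
# → ['2010-01', '2010-02', '2010-03', '2010-04']
```

Calling `toy.plot(kind='bar')` afterwards uses these shorter strings as the tick labels.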