import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# reading the csv data set
dataset = pd.read_csv("tips.csv")
print(dataset.info())
print(dataset.head())
dataset.describe()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 244 entries, 0 to 243
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   total_bill  244 non-null    float64
 1   tip         244 non-null    float64
 2   sex         244 non-null    object
 3   smoker      244 non-null    object
 4   day         244 non-null    object
 5   time        244 non-null    object
 6   size        244 non-null    int64
dtypes: float64(2), int64(1), object(4)
memory usage: 13.5+ KB
None
   total_bill   tip     sex smoker  day    time  size
0       16.99  1.01  Female     No  Sun  Dinner     2
1       10.34  1.66    Male     No  Sun  Dinner     3
2       21.01  3.50    Male     No  Sun  Dinner     3
3       23.68  3.31    Male     No  Sun  Dinner     2
4       24.59  3.61  Female     No  Sun  Dinner     4
 | total_bill | tip | size |
---|---|---|---|
count | 244.000000 | 244.000000 | 244.000000 |
mean | 19.785943 | 2.998279 | 2.569672 |
std | 8.902412 | 1.383638 | 0.951100 |
min | 3.070000 | 1.000000 | 1.000000 |
25% | 13.347500 | 2.000000 | 2.000000 |
50% | 17.795000 | 2.900000 | 2.000000 |
75% | 24.127500 | 3.562500 | 3.000000 |
max | 50.810000 | 10.000000 | 6.000000 |
dataset.shape
(244, 7)
# Plotting a bar chart of the number of tips recorded for each sex
tip_counts = dataset.groupby('sex')['tip'].count()
plt.bar(tip_counts.index, tip_counts.values)
# Giving our plot a title
plt.title("Number of Tips by Sex")
# Giving x and y labels names
plt.xlabel('Sex')
plt.ylabel('Number of tips')
plt.show()
# Plotting a line plot of tip against the row index
plt.plot(dataset['tip'])
# Giving x and y labels names
plt.xlabel('Instance')
plt.ylabel('Tip')
plt.show()
sns.lineplot(x='total_bill', y='tip', data=dataset)
<Axes: xlabel='total_bill', ylabel='tip'>
# Plotting Scatter plot of total_bill vs tip
plt.scatter(dataset['total_bill'], dataset['tip'])
# Giving our plot a title
plt.title("This is Scatter Plot")
# Giving x and y labels names
plt.xlabel('Total_bill')
plt.ylabel('Tip')
plt.show()
# Plotting a histogram of the tip column
plt.hist(dataset['tip'])
# Giving our plot a title
plt.title("This is Histogram Plot")
# Giving x and y labels names
plt.xlabel('Tip')
plt.ylabel('Frequency')
plt.show()
What is a Box Plot? A box plot visualizes the distribution of the data using a box and two lines (whiskers) that extend from it. It is also known as a box-and-whisker plot. The distribution is summarized by five key values:
Minimum (lower whisker limit): Q1 - 1.5 * IQR
1st quartile (Q1): 25th percentile
Median: 50th percentile
3rd quartile (Q3): 75th percentile
Maximum (upper whisker limit): Q3 + 1.5 * IQR
Here IQR is the interquartile range, the distance from the first quartile (Q1) to the third quartile (Q3), that is, IQR = Q3 - Q1. Points that fall outside the two whisker limits are drawn individually as outliers.
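The five values described above can be computed by hand to see where the box and the whisker limits come from. The following is a minimal sketch, not what matplotlib does internally: `box_plot_summary` is a hypothetical helper, the sample values are made up rather than read from tips.csv, and the quartiles use the standard library's "inclusive" interpolation, which may differ slightly from other quartile conventions.

```python
from statistics import quantiles


def box_plot_summary(values):
    """Return the quartiles, the IQR and the 1.5 * IQR whisker limits."""
    # method="inclusive" interpolates linearly between the sorted data points
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "lower_limit": q1 - 1.5 * iqr,
        "q1": q1,
        "median": median,
        "q3": q3,
        "upper_limit": q3 + 1.5 * iqr,
        "iqr": iqr,
    }


# Made-up tip amounts, used only for illustration (not taken from tips.csv)
sample_tips = [1.0, 2.0, 3.0, 4.0, 10.0]
summary = box_plot_summary(sample_tips)

# Values beyond the whisker limits would be drawn as individual outlier points
outliers = [t for t in sample_tips
            if t < summary["lower_limit"] or t > summary["upper_limit"]]

print(summary)
print(outliers)
```

With the real data, the same helper could be called as `box_plot_summary(dataset["tip"].tolist())`.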
# Box plot of the tip column
plt.figure(figsize=(10, 8))
plt.boxplot(dataset["tip"])
plt.show()
# We can look at an individual feature in Seaborn through a boxplot
sns.boxplot(x="sex", y="tip", data=dataset)
<Axes: xlabel='sex', ylabel='tip'>
Bubble Chart
# Use the scatter function to draw the chart; every bubble gets the same size (s=200)
plt.scatter(x=dataset["sex"], y = dataset["total_bill"],
s=20*10)
plt.show()
# Create the figure first, then plot, label, add the legend and show
plt.figure(figsize=(10, 4))
plt.scatter(x=dataset["total_bill"], y=dataset["time"],
            s=300, c="blue", alpha=0.4, linewidth=3, label="Bills")
plt.xlabel("Total bill")
plt.ylabel("Time")
plt.legend()
plt.show()
import plotly.graph_objects as go
import numpy as np
# Creating random data with numpy.random.randint
np.random.seed(42)
random_x = np.random.randint(1, 101, 100)
random_y = np.random.randint(1, 101, 100)
# One marker size per point, derived from y, so that every point gets an explicit size
bubble_size = random_y / 2 + 5
plot = go.Figure(data=[go.Scatter(
    x=random_x,
    y=random_y,
    mode='markers',
    marker_size=bubble_size)
])
plot.show()
import plotly.graph_objects as go
# Bubble chart of total_bill against tip, with marker size driven by party size
plot = go.Figure(data=[go.Scatter(
    x=dataset["tip"],
    y=dataset["total_bill"],
    mode='markers',
    marker_size=dataset["size"] * 8)
])
plot.show()
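import plotly.graph_objects as go
import numpy as np
# 50 random bubble sizes; only the first 50 rows are plotted so each point has a size
size = np.random.randint(1, 100, size=50)
print(size)
fig = go.Figure(data=[go.Scatter(
    x=dataset["tip"].head(len(size)),
    y=dataset["total_bill"].head(len(size)),
    mode='markers',
    marker=dict(
        size=size,
        sizemode='area',
        # Scale the sizes so the largest bubble is about 40 px wide
        sizeref=2.*max(size)/(40.**2),
        sizemin=4
    )
)])
fig.show()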
[79 59 32 96 88 52 62 58 52 12 39 2 3 56 81 59 2 2 92 54 87 96 97 1 19 2 53 44 90 32 70 32 68 55 75 56 17 38 24 69 98 70 86 11 16 97 73 59 70 80]
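The `sizeref` expression in the cell above follows the pattern `2 * max(size) / (desired_max_px ** 2)`. The sketch below works through that arithmetic on made-up sizes. The helper names are hypothetical, and the diameter calculation is an assumption about how area scaling behaves (diameter proportional to the square root of the value, with the largest value mapped to the desired maximum). It is not a call into Plotly itself.

```python
import math


def area_sizeref(sizes, max_marker_px):
    """Scaling factor used in the notebook: 2 * max(sizes) / max_marker_px ** 2."""
    return 2.0 * max(sizes) / (max_marker_px ** 2)


def approx_diameters(sizes, max_marker_px):
    """Approximate on-screen diameters, assuming bubble AREA grows linearly with value."""
    largest = max(sizes)
    return [max_marker_px * math.sqrt(s / largest) for s in sizes]


# Made-up bubble values, used only for illustration
sizes = [1, 25, 100]
sizeref = area_sizeref(sizes, max_marker_px=40)
diameters = approx_diameters(sizes, max_marker_px=40)

print(sizeref)
print(diameters)
```

Under this assumption, a value four times larger produces a bubble only twice as wide, which keeps large values from dominating the chart.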
pip install wordcloud
from wordcloud import WordCloud
# Start with a short sample text:
text = "India is my Nation, I love India, I Love my work, I Love myself and my family and my friends."
# Create and generate a word cloud image:
wordcloud = WordCloud().generate(text)
# Display the generated image:
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.show()
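A word cloud draws frequent words larger, roughly in proportion to how often they occur once common stop words are removed. The sketch below counts the words of the sample sentence to show which ones should dominate the image. The `STOP_WORDS` set and the `word_frequencies` helper are hypothetical and much simpler than what the wordcloud package does internally.

```python
import re
from collections import Counter

# Hypothetical, tiny stop-word list for this sketch only
STOP_WORDS = {"is", "my", "i", "and"}


def word_frequencies(text):
    """Count lower-cased words, skipping the stop words above."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


text = ("India is my Nation, I love India, I Love my work, "
        "I Love myself and my family and my friends.")
freq = word_frequencies(text)

# The most frequent words are the ones expected to appear largest
print(freq.most_common(3))
```

In this sentence "love" and "india" are the most frequent remaining words, so they are the ones expected to stand out in the generated image.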