Tableau Integration with Python – Latest

Tableau offers built-in statistical features such as forecasting, but those features alone do not cover the fundamentals of data science. To take your analysis further, you need to connect Tableau to Python or R.

Python is a popular general-purpose programming language used extensively in both industry and academia. It offers a wide range of machine learning and statistical techniques and is highly extensible.

Do you want to extract a report listing the Tableau data sources that are not accessible, or are you interested in sentiment analysis using Tableau Desktop? Then you’ve arrived at the right place.

TabPy (the Tableau Python Server) is an Analytics Extension implementation that expands Tableau’s capabilities by allowing users to execute Python scripts and saved functions via Tableau’s table calculations.

The powerful combination of Python and Tableau can cover a wide range of enterprise data analysis needs. R scripts have been callable from calculated fields through Rserve since Tableau 8.1, thanks to the R Integration feature that Tableau added in 2013. Python support followed with Tableau 10.1.

Please watch the video for a complete, step-by-step explanation of Tableau Python integration. Many blogs cover this with long lists of instructions; my aim here is simply to give you the quickest path to a working Tableau Python integration.

First things first: head over to Anaconda.com and install Anaconda. It is the simplest way to get Python, data science, and machine learning tooling onto a single computer.

 

1. Install Anaconda (visit www.anaconda.com)

2. After installation, you will see the anaconda3 folder in your user directory

3. Activate the base environment (base)

    C:\Users\RiteshBisht\anaconda3\Scripts>activate base

4. python -m pip install --upgrade pip

5. python -m pip install tableauserverclient

6. pip install tabpy 

7. pip install vaderSentiment (for Sentiment Analysis later)

8. Run C:\Users\RiteshBisht\anaconda3\Scripts>tabpy.exe

 

After completing the steps above, open a Tableau worksheet to establish a connection with the TabPy server.

9. Select “Help” > “Settings and Performance” > “Manage Analytics Extension Connection”.

10. Enter the hostname as “localhost” and the port as 9004

11. Click “Test Connection” and make sure the connection is established. Tableau should report:

“Successfully connected to the analytics extensions”
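If the connection test fails, it can help to check from Python whether TabPy is actually listening. The sketch below assumes TabPy is running without authentication on the default localhost:9004 and queries its /info REST endpoint. The function name tabpy_info is only an illustrative choice.

```python
import json
import urllib.error
import urllib.request


def tabpy_info(host="localhost", port=9004, timeout=5):
    """Return TabPy's /info payload as a dict, or None if it cannot be reached."""
    url = f"http://{host}:{port}/info"
    # TabPy runs locally in this setup, so skip any system-wide proxy settings
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply
        return None
```

Calling tabpy_info() with TabPy running should return a dictionary describing the server; None suggests that tabpy.exe is not running or is listening on a different port.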

You can create a scatter plot by placing the Tableau fields Segment on Columns, Sales on Columns, and Profit on Rows; this replicates the view on the right-hand side.

In my Tableau workbook, I have a calculated field called “Python-Corr”. It uses a SCRIPT_REAL table calculation to call Python’s NumPy package and return the coefficient of correlation.

Create a calculated field named “Python-Corr” and copy in the code below.

SCRIPT_REAL("import numpy as np
return np.corrcoef(_arg1,_arg2)[0,1]",
SUM([Sales]), SUM([Profit]))

Sales and Profit are passed in as _arg1 and _arg2 respectively, and the script returns their coefficient of correlation.
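To see what the script returns, here is a minimal pure-Python sketch of the Pearson correlation coefficient, which is the quantity np.corrcoef(_arg1, _arg2)[0, 1] computes. Inside Tableau, _arg1 and _arg2 arrive as lists with one aggregated value per row of the partition. The pearson function and the sample numbers are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance up to a constant factor)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # Raises ZeroDivisionError if either sequence is constant
    return cov / math.sqrt(var_x * var_y)


# Hypothetical SUM(Sales) and SUM(Profit) values, one pair per customer
sales = [100.0, 250.0, 400.0, 550.0]
profit = [12.0, 30.0, 41.0, 70.0]
print(round(pearson(sales, profit), 4))
```

Because the script returns a single number for the whole partition, every mark in that partition shows the same value.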

Drag the “Python-Corr” calculation to “Color” on the Marks card and make sure it is set to compute using Customer ID.

Hover over any Customer ID mark and the tooltip shows the coefficient of correlation between Sales and Profit, computed across the customers in that mark’s partition.

Follow the Tableau Python video and use the formula above to determine the relationship between Sales and Profit.

As an added bonus, here is a script that retrieves a list of data sources. You can modify it in many ways to address common problems (for example, listing data sources that have been unused for the last 3 months).

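import tableauserverclient as TSC

# Replace the placeholder credentials, site name, and server URL with your own
tableau_auth = TSC.TableauAuth("username@domain.com", "password", "sitename")
server = TSC.Server("https://yourservernameonline.tableau.com/")

# The session is signed out automatically when the with block exits
with server.auth.sign_in(tableau_auth):
    # get() returns one page of datasources plus pagination details
    all_datasources, pagination_item = server.datasources.get()
    print("\nThere are {} datasources on site: ".format(pagination_item.total_available))
    print([datasource.name for datasource in all_datasources])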
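As one possible starting point for the "unused for 3 months" idea, the sketch below filters datasource items by their updated_at timestamp. It assumes "not updated in 90 days" is an acceptable stand-in for "unused", which may not match your definition, and that each item exposes name and a timezone-aware updated_at, as Tableau Server Client datasource items do. The helper name stale_datasource_names is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def stale_datasource_names(datasources, days=90, now=None):
    """Names of datasources whose updated_at is more than `days` days old."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [
        ds.name
        for ds in datasources
        # Skip items with no timestamp rather than guessing their age
        if ds.updated_at is not None and ds.updated_at < cutoff
    ]


# Inside the signed-in with block from the script above, something like:
#     print(stale_datasource_names(TSC.Pager(server.datasources)))
# TSC.Pager walks every page of results, not only the first one.
```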

Helpful links on Tableau Python integration

1. Installation of Anaconda at https://www.anaconda.com

2. TabPy documentation is available at https://developer.tableau.com/tools/python-integration-tabpy

3. Link to Tableau on GitHub: https://tableau.github.io/TabPy

4. Community Post on Tableau-Python Integration

I hope this post was useful and that it helps you kick-start your journey in the data science world of Tableau.

Ritesh Bisht

#1 in the world to be Tableau Ambassador and Power BI Super User