A Practical Use-Case of Cloud-Native and Secured Dashboard with Google Cloud and Python Streamlit
Introduction
With the rise of cloud services and visualization tools that are friendly to data scientists, building a dashboard is getting easier and faster.
However, the growing number of options also makes them harder to understand and use.
This article shows one use case that combines these technologies: a secured dashboard for managing my investment portfolio.
This article explains the application from three perspectives: business, data science, and engineering. These are often described as the essential skills in data science, so I have split the explanation into three sections and you can jump to whichever interests you.
Business Perspective: Requirements
Though this article focuses on technology, it wouldn’t be convincing if my app were impractical (even if it is only for personal use).
Therefore, I will define some requirements before the implementation.
I buy some ETFs every month, but I don’t have a clear picture of what’s going on in my portfolio, because ETF prices go up and down day by day. In addition, I don’t check my portfolio frequently because I don’t want to spend much time watching the stock markets. This motivated me to create an app that satisfies the following requirements.
1. Show the specific ETFs I’m interested in so I can see whether each one is a bargain.
2. Show the current value of my portfolio so I can check how well or badly it is doing.
3. Show the ratio of ETF types (e.g., stock/bond/commodity) to help me decide whether I need to rebalance my portfolio toward my target ratio of asset types.
4. Update daily, because I’ll check it once a day at most.
5. Require authentication to keep my actual holdings private (this is IMPORTANT!).
In addition to the above requirements, the UI should be simple but provide sufficient information. Something between the iPhone’s Stocks app and TradingView is ideal for me.
Data Science Perspective: Data Modeling and Building the Data Pipeline
To meet the above requirements, I need to prepare a data mart for my dashboard. A data mart is one of the layers in a data model, alongside the data warehouse and the data lake. These layers have different purposes, so let me explain them in the following table.
Two types of visualization are needed, so I will create two data marts and a data warehouse that provides enough data for both.
Now let’s get into the data schema of data marts. The first data mart is to plot a line chart of my portfolio and stocks, so historical values need to be prepared. The second one is to plot a pie chart of the ratio of my portfolio, so each stock’s ratio needs to be calculated.
Calculate Portfolio Value
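As a rough illustration of the first data mart, the sketch below multiplies each day's closing price by the number of shares held and sums across ETFs. The tickers, prices, and share counts are made-up values, and the plain-dict layout is an assumption chosen for readability; the actual pipeline may well compute this differently (for example in pandas or SQL).

```python
from datetime import date

# Hypothetical holdings: ticker -> number of shares.
holdings = {"VTI": 10, "BND": 20}

# Hypothetical daily closing prices: date -> {ticker: close}.
closes = {
    date(2022, 1, 3): {"VTI": 240.0, "BND": 84.0},
    date(2022, 1, 4): {"VTI": 242.0, "BND": 83.5},
}


def portfolio_value_history(closes, holdings):
    """Return {date: total portfolio value}, one point per day for the line chart."""
    history = {}
    for day in sorted(closes):
        prices = closes[day]
        # Value of one position = closing price * shares held.
        history[day] = sum(
            prices[ticker] * shares for ticker, shares in holdings.items()
        )
    return history


history = portfolio_value_history(closes, holdings)
for day, value in history.items():
    print(day.isoformat(), value)
```

Each entry in `history` corresponds to one point on the portfolio line chart.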
Calculate Portfolio Ratio
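The ratio calculation for the second data mart could look roughly like the following sketch: sum the current value per asset type, then divide by the portfolio total. The positions and the field names (`ticker`, `asset_type`, `value`) are hypothetical and only serve the example.

```python
from collections import defaultdict

# Hypothetical positions: current value and asset type per ETF.
positions = [
    {"ticker": "VTI", "asset_type": "stock", "value": 2400.0},
    {"ticker": "BND", "asset_type": "bond", "value": 1200.0},
    {"ticker": "GLD", "asset_type": "commodity", "value": 400.0},
]


def ratio_by_asset_type(positions):
    """Return {asset_type: share of total value}; the shares sum to 1."""
    totals = defaultdict(float)
    for position in positions:
        totals[position["asset_type"]] += position["value"]

    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("portfolio has no value, so ratios are undefined")

    return {asset_type: value / grand_total for asset_type, value in totals.items()}


ratios = ratio_by_asset_type(positions)
```

The resulting mapping can be passed to a pie chart as labels and values.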
The details of the data modeling are omitted here due to space limitations. In the next article, I introduce Google BigQuery and dbt into this data pipeline to explain the modeling.
Ref: https://koyaaarr.medium.com/dbt-and-bigquery-in-practice-transform-stock-data-1771e2393319
Engineering Perspective: Architecture and Software
Finally, I select appropriate software and services and combine them to build my system. Here is the whole architecture.
Let me explain the components by role.
Data Retrieval, Transformation, and Accumulation Script
- Cloud Functions: retrieves, transforms, and accumulates data
- Cloud Storage: stores the data, which is served from here via API
- pandas-datareader: to get stock data
- gcsfs: to get data from Cloud Storage
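To make the roles of these components more concrete, here is a minimal sketch of what such a Cloud Function could look like. Everything specific in it is an assumption rather than the actual script: the bucket path, the ticker list, the start date, the `"stooq"` data source, and the long CSV layout are all hypothetical. Third-party imports are placed inside the functions that need them so the CSV helper stays usable on its own.

```python
import csv
import io
from datetime import date

BUCKET_PATH = "my-portfolio-bucket/prices/close.csv"  # hypothetical bucket/object
TICKERS = ["VTI", "BND"]  # hypothetical tickers


def to_csv_text(closes):
    """Turn {ticker: {date: close}} into long-format CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "ticker", "close"])
    for ticker in sorted(closes):
        for day in sorted(closes[ticker]):
            writer.writerow([day.isoformat(), ticker, closes[ticker][day]])
    return buffer.getvalue()


def fetch_closes(tickers, start, end):
    """Retrieve daily closing prices with pandas-datareader."""
    from pandas_datareader import data as pdr

    closes = {}
    for ticker in tickers:
        # "stooq" is an assumed data source; the article does not name one.
        frame = pdr.DataReader(ticker, "stooq", start, end)
        closes[ticker] = {ts.date(): float(v) for ts, v in frame["Close"].items()}
    return closes


def main(event, context):
    """Entry point for a Pub/Sub-triggered Cloud Function."""
    import gcsfs

    closes = fetch_closes(TICKERS, date(2020, 1, 1), date.today())
    fs = gcsfs.GCSFileSystem()
    # Accumulate the transformed data in Cloud Storage for the dashboard.
    with fs.open(BUCKET_PATH, "w") as f:
        f.write(to_csv_text(closes))
```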
Data Visualization Application
- Cloud Run: runs the application, containerized with Docker
- Cloud IAP (Identity-Aware Proxy): adds authentication to the Cloud Run app without coding
- Streamlit: serve a dashboard quickly and nicely
- Plotly: plot graphs quickly and nicely
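The application side can be sketched in a few lines of Streamlit and Plotly. This is an illustration under assumptions, not the actual app: the object path and the two-column `date,value` CSV layout are hypothetical, and the third-party imports live inside `render()` so that the parsing helper can be exercised without them.

```python
import csv
import io

# Hypothetical location of the data mart in Cloud Storage.
VALUE_CSV = "gs://my-portfolio-bucket/marts/portfolio_value.csv"


def parse_value_csv(text):
    """Parse 'date,value' CSV text into two parallel lists for plotting."""
    rows = list(csv.DictReader(io.StringIO(text)))
    dates = [row["date"] for row in rows]
    values = [float(row["value"]) for row in rows]
    return dates, values


def render():
    """Draw the portfolio line chart on a Streamlit page."""
    import gcsfs
    import plotly.express as px
    import streamlit as st

    # Read the data mart from Cloud Storage.
    fs = gcsfs.GCSFileSystem()
    with fs.open(VALUE_CSV, "r") as f:
        dates, values = parse_value_csv(f.read())

    st.title("My Portfolio")
    fig = px.line(x=dates, y=values, labels={"x": "Date", "y": "Value"})
    st.plotly_chart(fig)
```

Saved as `app.py` with a final `render()` call, this would be started with `streamlit run app.py`. Since Cloud IAP sits in front of Cloud Run, the app itself contains no authentication code.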
Operation, CI/CD
- Cloud Build: Connect with GitHub to automatically and immediately deploy to Cloud Run / Cloud Functions after git push
- Cloud Scheduler: triggers the Cloud Function regularly
- Cloud Pub/Sub: serves the same purpose, working together with Cloud Scheduler to trigger the Cloud Function
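One common way to wire these together is for Cloud Scheduler to publish a message to a Pub/Sub topic, which in turn triggers the Cloud Function. The article does not show this wiring, so the sketch below is an assumption: it uses the background-function signature `main(event, context)`, in which the message body arrives base64-encoded in `event["data"]`, and a hypothetical `mode` parameter in a JSON payload.

```python
import base64
import json


def decode_pubsub_message(event):
    """Decode the base64 'data' field of a Pub/Sub event into a dict."""
    raw = event.get("data")
    if not raw:
        return {}
    return json.loads(base64.b64decode(raw).decode("utf-8"))


def main(event, context):
    """Entry point of a Pub/Sub-triggered Cloud Function."""
    params = decode_pubsub_message(event)
    # "mode" is a hypothetical parameter sent by the scheduler job.
    mode = params.get("mode", "daily")
    print(f"Running portfolio update in {mode} mode")
    return mode
```

Keeping the payload optional means the same function also runs when the scheduler sends an empty message.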
How to use it regularly
I use a YAML file to simplify managing my portfolio. It configures the portfolio: the number of shares I hold and the details of each stock.
All I need to do when I buy some stocks is update the number of shares in this YAML file. After git push, Cloud Build detects the change and copies the YAML file to Cloud Storage automatically. The Cloud Function then recalculates from that data, so Cloud Run can fetch the latest results from there.
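For illustration, a portfolio file of this kind and a small loader might look like the sketch below. The field names (`ticker`, `name`, `asset_type`, `shares`) and the example holdings are hypothetical; the real file's schema is not shown in this article. PyYAML is imported lazily inside `load_portfolio` so the validation logic does not depend on it.

```python
# Hypothetical portfolio.yaml content; field names are illustrative.
EXAMPLE_YAML = """\
portfolio:
  - ticker: VTI
    name: Vanguard Total Stock Market ETF
    asset_type: stock
    shares: 10
  - ticker: BND
    name: Vanguard Total Bond Market ETF
    asset_type: bond
    shares: 20
"""

REQUIRED_KEYS = {"ticker", "asset_type", "shares"}


def validate_portfolio(config):
    """Check a parsed config and return {ticker: shares}."""
    holdings = {}
    for entry in config["portfolio"]:
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"{entry.get('ticker', '?')}: missing {sorted(missing)}")
        if entry["shares"] < 0:
            raise ValueError(f"{entry['ticker']}: shares must not be negative")
        holdings[entry["ticker"]] = entry["shares"]
    return holdings


def load_portfolio(text):
    """Parse YAML text with PyYAML and validate it."""
    import yaml

    return validate_portfolio(yaml.safe_load(text))
```

With a layout like this, buying more shares means changing a single `shares` value before pushing, and validating on load catches a malformed edit early.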
Lastly, this is what my dashboard looks like.
Conclusion
In this article, I broke down my application into three perspectives and explained how to create a dashboard using Google Cloud and Python Streamlit. There are a lot of valuable services, such as Cloud Run and Cloud IAP. They look complicated at first but are quite helpful for building an application quickly, so I strongly recommend diving into them. I hope you find this helpful.
Reference
- Data Science Venn diagram (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram)
- TradingView (https://www.tradingview.com/)
- iPhone Stocks app (https://apps.apple.com/us/app/stocks/id1069512882)
- Enabling IAP with Cloud Run (https://cloud.google.com/iap/docs/enabling-cloud-run)