How AI Powers SakeSaySo's Language Learning
In our ongoing journey with SakeSaySo, we’ve been exploring ways to automate and enrich the learning experience by experimenting with and integrating various AI tools. This project started as a modest side endeavor where we could toy with ideas and tools, particularly in automating content creation and embracing self-hosting, privacy and open-source principles. Most of these ideas can be traced back to Tokyo izakaya sessions, where the only things flowing more freely than the drinks are ideas about changing language learning.
While the app itself, the logo and most of the translations and content were developed with tools like ChatGPT, DALL·E 3 (that’s right, the logo was created in seconds and just converted to SVG), GitHub Copilot, Claude.ai and other AI tools, one of the challenges was keeping content fresh and engaging without dedicating extensive resources to manual updates. This is a minor side project after all. To address this, we experimented with leveraging large language models (LLMs) to automate the generation of daily news summaries and social media content.
Automated News Summaries (in-app)
We developed a system that aggregates news headlines from various sources via RSS feeds. The process involves:
- Fetching Headlines: The system retrieves recent articles on topics like general news, science, economy, and sports, filtering out any content older than a week.
- Generating Summaries: Using LLMs, we generate concise summaries and translations of these articles. The models help in producing natural-sounding language that is suitable for learners.
- Formatting for Learning: The summaries are formatted to highlight key vocabulary and phrases, providing an educational context that aids in language acquisition.
Automated Social Media Posts
We also looked into automating our social media presence to offer additional touchpoints for engagement. Here’s how we approached it:
- Creating Tweets: The system generates short tweets in Japanese with a light-hearted theme. The idea is to craft content that is culturally relevant and entertaining, without touching sensitive topics or being cringe. This was a challenge of prompt engineering in itself.
- Multi-Step Refinement: Each generated tweet undergoes several stages of refinement:
  - Content Evaluation: We check for appropriateness, length (ensuring it fits within platform character limits), and educational value.
  - Language Quality: The text is reviewed for naturalness and clarity, important factors in language learning.
  - Translation and Annotation: English translations and brief explanations are added to aid understanding.
- Automated Posting: Auto-approved content is scheduled and posted automatically, maintaining a consistent online presence without manual intervention.
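The mechanical part of content evaluation, such as the length check, can be sketched as a chain of small gates that a draft must pass before it is auto-approved. The sketch below is an assumption-laden illustration, not our actual pipeline: the `check` type, the 140-character limit, and the blocklist term are all hypothetical, and the LLM-based steps (language quality, translation) are left out.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// verdict records whether a draft passed and, if not, why.
type verdict struct {
	Approved bool
	Reasons  []string
}

// check is one gate in the chain. It returns a reason when the draft fails.
type check func(draft string) (reason string, ok bool)

// maxLength rejects drafts longer than limit. Length is counted in runes,
// not bytes, because each Japanese character takes several bytes in UTF-8.
// The real platform limit and its counting rules are not modeled here.
func maxLength(limit int) check {
	return func(draft string) (string, bool) {
		n := utf8.RuneCountInString(draft)
		if n > limit {
			return fmt.Sprintf("too long: %d characters, limit %d", n, limit), false
		}
		return "", true
	}
}

// avoidTerms rejects drafts containing any term from a hypothetical blocklist.
func avoidTerms(terms ...string) check {
	return func(draft string) (string, bool) {
		for _, t := range terms {
			if strings.Contains(draft, t) {
				return "contains blocked term: " + t, false
			}
		}
		return "", true
	}
}

// evaluate runs every check and approves the draft only if all of them pass.
// It collects every failure reason instead of stopping at the first one.
func evaluate(draft string, checks ...check) verdict {
	v := verdict{Approved: true}
	for _, c := range checks {
		if reason, ok := c(draft); !ok {
			v.Approved = false
			v.Reasons = append(v.Reasons, reason)
		}
	}
	return v
}

func main() {
	draft := "円相場が安定して、BARではまだまだ盛り上がっているみたい。"
	v := evaluate(draft, maxLength(140), avoidTerms("選挙"))
	fmt.Println(v.Approved, v.Reasons)
}
```

Collecting all reasons, rather than failing fast, makes it easier to feed the complete list back into a follow-up prompt when a draft is rejected.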
The content won’t blow you away and doesn’t always make perfect sense, but it’s a fun way to engage with the language and culture.
A generated tweet, including Japanese and its English translation, looks like this:
円相場が安定(あんてい)して、インフレイザーさんのツイートも減ったけど、BARではまだまだ盛り上がっているみたい。
The Bitcoin price has stabilized, and even the influencers' tweets have slowed down, but the bars are still lively. 🍺🎉
— SakeSaySo 🍶🇯🇵🇺🇸 (@sakesayso) September 28, 2024
We keep refining the prompts and workflow steps as we go, based on the results we see.
Data Pipelines & Workflow Tools
To manage these automated processes, we found existing solutions either too complex or not tailored to our specific needs. As a result, we developed custom tools to orchestrate the workflows involved in content generation and distribution. Using Kyodo Tech’s Orchid framework (on GitHub), a lightweight workflow orchestration framework built in Go, we handle multi-step workflows efficiently. Key features include:
- Directed Graph Structure: Workflows are defined as directed acyclic graphs (DAGs), where each node represents a task, and edges define the execution order based on dependencies.
- Data Passing Between Tasks: Tasks communicate by passing data in a standardized format, allowing for modularity and scalability.
- Error Handling and Retries: Orchid includes robust error handling, with customizable retry policies for tasks that may fail due to transient issues.
- Dynamic Routing: Based on runtime data and conditions, the workflow can branch dynamically, allowing flexible execution paths.
- Context Management: It leverages Go’s context package to manage cancellation signals and pass metadata throughout the workflow.
This orchestration system, powered by the Orchid framework, enabled us to automate complex sequences, such as fetching data, processing it through AI models, refining the outputs, and handling the final distribution, all without manual oversight.
Infrastructure Automation
For deployment and infrastructure management, we created a process on top of Pulumi that automates the provisioning of infrastructure and Kubernetes environments. Our focus is on:
- Infrastructure as Code: Using Go and a set of libraries, we define and manage cloud resources programmatically, allowing for repeatable and version-controlled deployments.
- Integration with CI/CD Pipelines: Continuous integration and delivery systems automate the deployment process from code commit to production.
- Secret Management: Securely handling sensitive information, like API keys and tokens, within the deployment pipeline.
Automated infrastructure provisioning lets us quickly spin up environments for testing and production, keeping our services consistent and reliable.
Embracing Self-Hosting and Privacy
A core principle in our project is the emphasis on user privacy and control. We operate without tracking or advertising.
- No Third-Party Tracking: The app does not include any third-party analytics or tracking services. User activity remains private and is not shared or sold.
- Offline Functionality: All features of SakeSaySo are available offline. Users can download content and continue their learning without needing a constant internet connection.
- Data Ownership: Users have full control over their data. Vocabulary decks and learning progress are stored locally, and users can export or delete their data at any time.
- Open Communication: We’re transparent about how the app operates and are committed to open-sourcing as much of our work as possible. This allows the community to inspect, contribute to, and modify the code to suit their needs.
Self-hosting also reduces our dependence on third-party services and keeps costs low.
Our experimentation with AI and automation has been a learning experience. By building custom tools and integrating advanced technologies, we’ve been able to automate much of the work with minimal costs. While our approach is one of many possible paths, our focus has been on practicality and learning, rather than striving for perfection. SakeSaySo remains a side project, offering a platform for experimenting and sharing ideas. We look forward to open-sourcing our tools soon and welcome collaboration.
Thank you for taking the time to read about our progress and reflections.