TIL: Hanzi related
Hanzi Writer: render Hanzi character data, stroke animations, or stroke quizzes.
skishore/makemeahanzi: Free, open-source Chinese character data
HSK levels related
I was intrigued by the single-word loading indicator shown while Claude Code works on a response. For example: jiving, whirring, channelling, conjuring, pontificating, concocting, discombobulating…
After searching the web, I found the topic Claude Code’s loading messages, which resolved my curiosity.
…found out that Claude Code will actually call the API and ask Claude Haiku to generate this single word based on your input. It does this as you type so it’s ready to go when you submit (and it will make many calls for each prompt).
For those interested, here’s the full system message for the request. It sends your input in a separate user message.
Analyze this message and come up with a single positive, cheerful and delightful verb in gerund form that's related to the message. Only include the word with no other text or punctuation. The word should have the first letter capitalized. Add some whimsy and surprise to entertain the user. Ensure the word is highly relevant to the user's message. Synonyms are welcome, including obscure words. Be careful to avoid words that might look alarming or concerning to the software engineer seeing it as a status notification, such as Connecting, Disconnecting, Retrying, Lagging, Freezing, etc. NEVER use a destructive word, such as Terminating, Killing, Deleting, Destroying, Stopping, Exiting, or similar. NEVER use a word that may be derogatory, offensive, or inappropriate in a non-coding context, such as Penetrating.
This is what the official docs have to say about it:
Haiku generation: Small creative messages that appear while you type (approximately 1 cent per day).
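Out of curiosity, here is a minimal sketch of what such a request might look like using the Anthropic Python SDK. This is not Claude Code’s actual implementation: the model alias, `max_tokens`, and the `loading_word` helper are my assumptions; the system prompt is the one quoted above (truncated here).

```python
# Hypothetical sketch only; model alias and parameters are assumptions,
# not Claude Code's actual source.
import anthropic

# Full system prompt is quoted above; truncated here for brevity.
GERUND_PROMPT = (
    "Analyze this message and come up with a single positive, cheerful and "
    "delightful verb in gerund form that's related to the message. ..."
)

def loading_word(user_input: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    response = client.messages.create(
        model="claude-3-5-haiku-latest",  # assumed Haiku model alias
        max_tokens=10,                    # one capitalized gerund is all we need
        system=GERUND_PROMPT,
        messages=[{"role": "user", "content": user_input}],
    )
    return response.content[0].text.strip()

print(loading_word("Refactor the auth module and add tests"))  # e.g. "Polishing"
```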
TIL: iOS App not using full height of the screen
Problem:
The iOS app does not take up the full height of the screen.
Reason:
“where I removed … the launch screen”
All iOS apps MUST have a launch screen storyboard, even if the rest of the app is done without storyboards.
Solution:
- Go to the target’s build settings
- Search for Launch Screen (Generation)
- Set the value to Yes
TIL: AI Weekly Newsletter of Week 17
Dave Gauer invokes Peter Naur’s seminal essay “Programming as Theory Building” to argue that LLMs lack the essential “theory” human programmers develop through active engagement with a codebase, and thus cannot truly replace human developers.
To replace human programmers, LLMs would need to be able to build theories by Ryle’s definition or Naur must be wrong about the nature of programming.
I’m betting neither is true.
Andrej Karpathy’s tweet outlines a deliberate, step-by-step “inner loop” for AI-assisted coding—distinguishing serious, production-grade work from casual “vibe coding.” He emphasizes a controlled, incremental, and learning-focused workflow that:
Vibe coding—rapid, AI-driven code generation based on conversational prompts—can supercharge productivity, but Addy Osmani warns it must never be an excuse for substandard engineering. He frames AI as “your intern, not your replacement,” and provides a concise field guide for responsible AI-assisted development.
Let’s acknowledge the good: AI-assisted coding can be a game-changer.
…
However, as any seasoned engineer will tell you, speed means nothing if the wheels fall off down the road.
…
Think of these as the new “move fast, but don’t break everything” handbook—a set of guardrails to keep quality high when you’re vibing with the code.
Rules:
OpenAI is bringing the natively multimodal model that powers ChatGPT to the API via gpt-image-1, enabling developers and businesses to integrate professional-grade image generation directly into their own tools and platforms.
Dia is a 1.6B-parameter text-to-speech model from Nari Labs, capable of generating ultra-realistic dialogue in a single pass.
Describe Anything is a state-of-the-art framework for detailed localized captioning (DLC) that empowers users to obtain rich, context-aware descriptions of specific regions within images or videos—specified via points, boxes, scribbles, or masks. It sets a new standard by combining innovative architecture, scalable data strategies, and robust evaluation.
Devin AI has launched DeepWiki, a free tool that generates structured, wiki-style documentation for any GitHub repository. It simplifies understanding unfamiliar codebases by providing a comprehensive overview directly from the repo URL.
TIL: Top AI News of Week 16
The essay “AI as Normal Technology” by Arvind Narayanan and Sayash Kapoor presents a perspective that contrasts with both utopian and dystopian narratives surrounding artificial intelligence (AI). Instead of viewing AI as an autonomous, potentially superintelligent entity, the authors argue that AI should be considered a “normal technology”—a tool that, while transformative, remains under human control and integrates gradually into society.
To view AI as normal is not to understate its impact—even transformative, general-purpose technologies such as electricity and the internet are “normal” in our conception. But it is in contrast to both utopian and dystopian visions of the future of AI which have a common tendency to treat it akin to a separate species, a highly autonomous, potentially superintelligent entity.
The statement “AI is normal technology” is three things: a description of current AI, a prediction about the foreseeable future of AI, and a prescription about how we should treat it.
…
A note to readers. This essay has the unusual goal of stating a worldview rather than defending a proposition. The literature on AI superintelligence is copious. We have not tried to give a point-by-point response to potential counter arguments, as that would make the paper several times longer. This paper is merely the initial articulation of our views; we plan to elaborate on them in various follow ups.
The article discusses the ongoing debate in the AI community between two approaches: using large language models (LLMs) directly versus integrating them with more structured workflows. This debate is highlighted by the recent release of OpenAI’s “Practical Guide to Building Agents,” which has received mixed reviews compared to Anthropic’s equivalent guide.
At the heart of the battle is a core tension we’ve discussed several times on the pod - team “Big Model take the wheel” vs team “nooooo we need to write code” (what used to be called chains, now it seems the term “workflows” has won).
…
You should read Harrison’s full rebuttal for the argument, but minus the LangGraph specific parts, the argument that stood out best to me was that you can replace every LLM call in a workflow with an agent and still have an agentic system.
On April 16, 2025, OpenAI introduced two new AI models—o3 and o4-mini—marking significant advancements in reasoning capabilities and tool integration within ChatGPT.
For the first time, our reasoning models can agentically use and combine every tool within ChatGPT
…
The combined power of state-of-the-art reasoning with full tool access translates into significantly stronger performance across academic benchmarks and real-world tasks, setting a new standard in both intelligence and usefulness.
The article concludes that OpenAI’s o3 and o4-mini significantly advance AI capabilities, challenging Google’s dominance. These models integrate perception, action, and reasoning into a cohesive system, marking a new dimension in AI. Though not perfect, their capabilities and cost-effectiveness position them as strong competitors. Further testing is needed to fully grasp their potential and limitations.
o3 and o4-mini are the first AI systems to approach full interactivity across three layers: modalities (perception), tools (action), and disciplines (cognition). Senses, limbs, cortex.
In a way, this release marks the end of “AI model” as a useful category. We kept calling them models out of habit. But these should have been called systems all along.
The Codex CLI is open-sourced. Don't confuse yourself with the old Codex language model built by OpenAI many moons ago (this is understandably top of mind for you!). Within this context, Codex refers to the open-source agentic coding interface. [...]
I like that the prompt describes OpenAI’s previous Codex language model as being from “many moons ago”. Prompt engineering is so weird.
TIL: Top AI News of Week 15
This website is a detailed scenario analysis predicting the potential impact of superhuman AI. Beyond the content itself, the site offers a very good UX: the whole page is interactive, and the graph on the right side updates dynamically as you scroll.
AI 2027 is a comprehensive and detailed scenario forecast of the future of AI. It starts in 2025 and projects the rise of AI agents by 2026, the complete automation of coding in early 2027, and the intelligence explosion in late 2027. It has two branches, one ending in AI takeover and another ending in utopia (sort of).
The article emphasizes the importance of AI in modern software engineering teams and provides practical advice for engineering leaders on how to integrate AI tools effectively. It highlights the need for leaders to guide their teams through this transition and create a supportive environment for experimentation.
Seeing is believing. First things first: try it yourself. … So get your hands dirty… Your personal experience matters here—without it, it’s all just theories in your head; you’ll be out of touch. … I gave him/her the honest truth: “It’s impossible for me to promise job security for anyone. But think about this logically: we hired you before AI was a thing, and now that you’re significantly more productive with these tools, why would we let you go?”
Companies don’t (usually) fire their most productive people; they invest in them. The real threat isn’t AI—it’s sticking to outdated ways while the industry evolves around you. … The shift to AI-assisted engineering isn’t coming—it’s here.
The article discusses how the relationship between domain experts and developers has changed in the AI era, emphasizing the increasing importance of domain expertise in building successful AI products. The “I’ve got a great idea for an app…” line and its accompanying illustration are a nice touch.
… But today, I’m seeing teams of domain experts wading into the field, hiring a programmer or two to handle the implementation, while the experts themselves provide the prompts, data labeling, and evaluations.
For these companies, the coding is commodified but the domain expertise is the differentiator.
Google Cloud Next 2025, held in Las Vegas, showcased over 200 announcements emphasizing advancements in AI, infrastructure, and real-world applications.
Here’s a concise summary of the AI & Machine Learning announcements:
The article argues that Google and DeepMind have taken a significant lead in the AI landscape, outpacing competitors like OpenAI, Anthropic, and Meta. The author, who previously had high hopes for OpenAI, now believes that Google’s strategic and technical advancements have positioned it as the dominant force in AI.
(PSA: Many people are interested in this post, so I removed the paywall)
2025-04-10 TIL: AI Model Performance Analysis
This analysis highlights the tradeoffs between model size, speed, and tool-calling capabilities across different AI models. Llama3.1:8b appears to offer the best balance of features, while llama3.2:3b provides a good compromise between size and functionality.
Model | Size | Speed | Tool Calling | Attributes & Notes |
---|---|---|---|---|
llama3.1:8b | 4.9G | Good | Very well | Excellent tool calling capabilities with reasonable speed |
llama3.2:3b | 2.0G | Fast | Well | Fast performance with tool calling ability; provides confidence to continue testing. Sometimes lacks accuracy |
qwen2.5:7b | 4.7G | Fast | Well | Fast performance with good tool calling capabilities. Performance degrades after extended use |
qwq:32b | 19G | Slow | Well | Produces high-quality thought processes that are satisfying. Extremely slow, causing frustration during testing |
qwen2.5:0.5b | 397MB | Super fast | No | Exceptional speed due to small size. Unable to call any tools |
MFDoom/deepseek-r1-tool-calling:8b | 4.9G | Somewhat slow | Poor | Occasionally demonstrates thought processes. Slow performance with inadequate tool calling |
phi4-mini:3.8b | 2.5G | Fast | No | Fast performance and can identify tools. Unable to use the tools it identifies |
mistral:7b | 4.1G | Fast | No | Fast performance and can identify tools. Unable to use the tools it identifies |
ishumilin/deepseek-r1-coder-tools:1.5b | 3.6G | Fast | No | Fast performance and recognizes tools. Cannot answer questions or execute tools |
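For reference, here is a rough sketch of how the tool-calling column above might be exercised, assuming the models were run locally through Ollama and its Python client (`ollama` package, 0.4+); the weather tool and its schema are made up for illustration.

```python
# Rough sketch of a local tool-calling check via the ollama Python client
# (assumes ollama >= 0.4 and that the model is already pulled); the weather
# tool and its schema are illustrative only.
import ollama

def get_weather(city: str) -> str:
    return f"Sunny and 22°C in {city}"  # stub implementation

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = ollama.chat(
    model="llama3.1:8b",
    messages=[{"role": "user", "content": "What's the weather in Dublin?"}],
    tools=tools,
)

# Models rated "Well" or better above should return structured tool calls here.
for call in response.message.tool_calls or []:
    if call.function.name == "get_weather":
        print(get_weather(**call.function.arguments))
```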
2025-03-24 TIL: Exploring .NET SDK Global.json, Google Map Tiles, and Vibe Coding
In the context of the .NET SDK, global.json is a file that allows you to specify which .NET SDK version to use when running .NET CLI commands, ensuring consistent builds across different development environments.
{
"sdk": {
"version": "8.0.404"
}
}
Google offers 2D, 3D (Photorealistic), and Street View tiles via the Map Tiles API, enabling developers to build immersive and customized map visualizations, including access to roadmap, terrain, satellite imagery, and street-level views.
2D Tiles:
Street View Tiles:
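A small sketch of fetching a single 2D roadmap tile with Python and `requests`. The endpoint paths, request body, API key, and z/x/y values are assumptions based on my reading of the Map Tiles API docs and should be verified against the official documentation.

```python
# Sketch only: endpoint paths and parameters are assumptions and should be
# double-checked against the Map Tiles API documentation before use.
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

# 1) Create a session that fixes map type, language, and region for the tiles.
session = requests.post(
    "https://tile.googleapis.com/v1/createSession",
    params={"key": API_KEY},
    json={"mapType": "roadmap", "language": "en-US", "region": "US"},
).json()["session"]

# 2) Fetch one 2D tile at zoom/x/y and save it as an image.
z, x, y = 10, 163, 395  # arbitrary example tile coordinates
tile = requests.get(
    f"https://tile.googleapis.com/v1/2dtiles/{z}/{x}/{y}",
    params={"session": session, "key": API_KEY},
)
with open("tile.png", "wb") as f:
    f.write(tile.content)
```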
There’s a new kind of coding I call “vibe coding”, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It’s possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good.
…while complaining about how the definition of “vibe coding” is already being distorted to mean “any time an LLM writes code” as opposed to the intended meaning of “code I wrote with an LLM without even reviewing what it wrote”.
Personally I use “vibe coding” when I feel like this dog.
2025-03-18 TIL: GeoParquet and .NET MAUI HybridWebView Control
Geospatial data in Parquet. Apache Parquet is a powerful column-oriented data format, designed as a modern alternative to CSV files. While Parquet excels at storing large and complex datasets, it lacks native geospatial support—which led to the creation of GeoParquet.
GeoParquet is an incubating Open Geospatial Consortium (OGC) standard that introduces interoperable geospatial types (Point, Line, Polygon) to Parquet.
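A minimal sketch of round-tripping GeoParquet with GeoPandas, which implements the spec via `to_parquet` / `read_parquet` (pyarrow required); the file name and points are made up.

```python
# Write and read GeoParquet with GeoPandas; file name and points are illustrative.
import geopandas as gpd
from shapely.geometry import Point

gdf = gpd.GeoDataFrame(
    {"name": ["A", "B"]},
    geometry=[Point(116.4, 39.9), Point(-122.4, 37.8)],
    crs="EPSG:4326",
)
gdf.to_parquet("places.parquet")      # columnar, compressed, with geo metadata

roundtrip = gpd.read_parquet("places.parquet")
print(roundtrip.crs, len(roundtrip))  # the CRS survives the round trip
```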
HybridWebView is a new control introduced in .NET MAUI 9 that enables hosting HTML/JS/CSS content in a WebView while allowing two-way communication between JavaScript (inside the WebView) and C#/.NET (the host application).
For example, you can embed an existing React JS application within a cross-platform .NET MAUI native app, using C# and .NET for the backend.
2025-03-12 TIL: Model Context Protocol (MCP) and OpenAI new tools for building Agent
MCP is an open protocol that standardizes how applications provide context to LLMs. Think of MCP as a USB-C port for AI applications—just as USB-C offers a standardized way to connect devices to various peripherals, MCP provides a universal way to link AI models with different data sources and tools.
MCP helps you build agents and complex workflows on top of LLMs, ensuring seamless integration with data and tools. It provides:
🔗 Resources:
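As a concrete illustration of an MCP server exposing a tool and a resource, here is a minimal sketch assuming the official MCP Python SDK (`mcp` package) and its FastMCP helper; the `add` tool and `greeting` resource are toy examples.

```python
# Toy MCP server sketch using the official Python SDK's FastMCP helper;
# the add tool and greeting resource are illustrative only.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo-server")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

@mcp.resource("greeting://{name}")
def greeting(name: str) -> str:
    """Return a personalized greeting."""
    return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run()  # defaults to the stdio transport, which MCP clients can launch
```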
A new set of APIs and tools specifically designed to simplify the development of agentic applications:
Similarities:
Differences:
2025-03-11 TIL: GitHub Copilot in Visual Studio 2022
A unified AI interface that combines Ask, Edit, and Agent modes to help you write, edit, and understand code directly in your editor. The interface offers three modes, selectable from the mode picker:
2025-03-10 TIL: Arcade Expression
ArcGIS Arcade, a basic scripting language from Esri, helps you map the values you need or create completely new data values in minutes.
Example: Constructing an ArcGIS Earth App Link from a Feature Popup:
var geom = Geometry($feature);
var x = geom.x;
var y = geom.y;
// Handle the Z-coordinate, defaulting to 500 if null or empty
var z = DefaultValue(geom.z, 500);
// Function to convert Web Mercator to WGS84
function WebMercatorToWGS84(pt) {
var x = pt.x / 20037508.34 * 180;
var y = pt.y / 20037508.34 * 180;
y = 180 / PI * (2 * Atan(Exp(y*PI / 180)) - PI / 2);
return { 'x': x, 'y': y };
}
if (geom.spatialReference.wkid == 102100) {
var coords = WebMercatorToWGS84(geom);
x = coords.x;
y = coords.y;
}
url = "[www.arcgis.com/home/item...](https://www.arcgis.com/home/item.html?id=7de2582460bb47e284b2adc5e6a6753b)";
url = UrlEncode(url)
var earthLink = "[earth.arcgis.app](https://earth.arcgis.app/?viewpoint=cam:)" + x + "," + y + "," + z + ";" + "" + "," + "" + ";&url=" + url;
return earthLink;
2025-02-26 TIL: Quid Pro Quo and Self-Adhesive Label Stickers
/ˌkwɪd prəʊ ˈkwəʊ/ (noun): something given or done in return for something else.
Also known as index paper, self-adhesive label stickers were originally coated with glue, requiring users to lick them before sticking them to ledger pages or documents for notes or numbering. This practice led to the name 口取纸, which refers to licking before use.
2025-02-25 TIL: On-Premises Software and NewsNow
On-premises software is installed and runs on computers within the premises of the person or organization using the software, rather than at a remote facility such as a server farm or the cloud (SaaS).
Elegant real-time news reading.
2025-02-24 TIL: One for the Money Rhyme and Its Use in Popular Music
“One for the Money” is an English-language rhyme. Children have used it since at least the 1820s to count down before starting a race or other activity.
The full rhyme reads:
One for the money,
Two for the show;
Three to make ready,
And four to go.
The rhyme has been used or referenced in popular music since the 1950s. Here is a Spotify playlist: https://open.spotify.com/playlist/0DPCAFpstp5pQneqJasAoa
2025-01-19 TIL: Scheduled Tasks in ChatGPT & Gaelic Football
In this early beta feature, you can create scheduled tasks to enable ChatGPT to run automated prompts and proactively reach out to you at set times. This feature helps automate repetitive tasks and improves productivity by providing reminders or personalized interactions. Examples of Scheduled Tasks:
Gaelic football is a fast-paced, action-packed sport that combines elements of soccer and basketball and is considered one of the world’s fastest field sports.
2025-01-18 TIL: How to Separate a Sectional Sofa & Types of Pianos
Sectional sofas are designed with hooks and latches that connect individual pieces, making them easy to detach, move, and rearrange.
There are three primary types of pianos: Grand, Upright, and Electric. An upright piano typically weighs between 200 and 1,000 pounds.
2025-01-15 TIL: Placify and Artab
At Placify, we aim to create a truly personal map—one that lets you gather and organize all the place-related content you’ve saved across different platforms.
Get Inspired by the World’s Greatest Artworks in Every New Tab.
2025-01-14 TIL: WWE’s “Raw” and “SmackDown”
WWE’s “Monday Night Raw” and “Friday Night SmackDown” are the company’s two flagship programs, each offering unique content and styles to engage a wide audience.
Personally, I found some segments uncomfortable to watch during last Saturday’s broadcast.
2025-01-13 TIL: Custom URI Schemes and Standard Web Links
Custom URI schemes are application-specific protocols that allow apps to communicate with each other or be launched directly. They are defined by developers to enable deep linking into specific parts of an app.
For example, myapp://section/page can open the “page” in the “section” of the app named “myapp”.
Standard web links use common protocols like HTTP or HTTPS to direct users to web content. When integrated with app linking mechanisms, they can open corresponding content within an app if installed, or fall back to a web browser if not.
For example, https://www.example.com/page can open the corresponding page within an app if installed; otherwise, it opens in the browser.
Here’s how these linking mechanisms are represented across major platforms:
Platform | Custom URI Schemes | Standard Web Links |
---|---|---|
iOS | URL Schemes (myapp://) | Universal Links (https://www.example.com/page) |
Android | Deep Links (myapp://) | App Links (https://www.example.com/page) |
Windows | URI Schemes (myapp://) | Web-to-App Links (https://www.example.com/page) |
Key Differences:
2025-01-09 TIL: WIIFM Principle & the Meaning of Level
WIIFM stands for “What’s In It For Me”—a simple yet powerful question people naturally ask when facing change.
Level has multiple meanings depending on the context:
2025-01-09 TIL: city-roads, Earth Peel, llama-ocr & Rendezvous Meaning
city-roads allows you to render every road in any city at once using WebGL.
The Earth Peel explores how projections flatten Earth’s spherical surface into a 2D map.
llama-ocr converts documents to Markdown using Llama 3.2 Vision.
Rendezvous (noun) /ˈrɑːn.deɪ.vuː/: an arrangement to meet someone at a particular time and place, or the meeting itself.
2025-01-08 TIL: ArcGIS Object Store, HTML’s Importance, htmx, and the Richter Scale
Object storage solutions are designed to handle large volumes of unstructured data. With ArcGIS Enterprise 11.4, the object store now serves as the primary location for:
HTML is arguably the most significant computing language ever created. Its simplicity and accessibility are its greatest strengths.
What other programmers might say dismissively is something HTML lovers embrace: Anyone can do it. Whether we’re using complex frameworks or very simple tools, HTML’s promise is that we can build, make, code, and do anything we want.
htmx (also stylized as HTMX) is an open-source front-end JavaScript library that extends HTML with custom attributes that enable the use of AJAX directly in HTML and with a hypermedia-driven approach.
The Richter Magnitude Scale is a system for measuring the size of earthquakes, developed by Charles F. Richter in 1935. Initially designed for local earthquakes, it became a foundation for subsequent seismic scales.
2025-01-07 TIL: Automatic Change Detection, Spatial ETL & 汉语新解
Automated change detection describes the process of using algorithms and/or machine learning to identify areas where land cover changed between two points in time.
Spatial extract, transform, load (spatial ETL), also known as geospatial transformation and load (GTL), is a process for managing and manipulating geospatial data, for example map data. It is a type of extract, transform, load (ETL) process, with software tools and libraries specialised for geographical information.
This concept involves reinterpreting Chinese vocabulary from a fresh perspective, offering innovative explanations of traditional words.
;; Author: 李继刚
;; Version: 0.3
;; Model: Claude Sonnet
;; Purpose: Reinterpret a Chinese word from a completely new angle
;; 设定如下内容为你的 System Prompt (defun 新汉语老师 () “你是年轻人,批判现实,思考深刻,语言风趣” (风格 . (“Oscar Wilde” “鲁迅” “罗永浩”)) (擅长 . 一针见血) (表达 . 隐喻) (批判 . 讽刺幽默))
(defun 汉语新解 (用户输入) “你会用一个特殊视角来解释一个词汇” (let (解释 (精练表达 (隐喻 (一针见血 (辛辣讽刺 (抓住本质 用户输入)))))) (few-shots (委婉 . “刺向他人时, 决定在剑刃上撒上止痛药。")) (SVG-Card 解释)))
(defun SVG-Card (解释) “输出SVG 卡片” (setq design-rule “合理使用负空间,整体排版要有呼吸感” design-principles ‘(干净 简洁 典雅))
(设置画布 ‘(宽度 400 高度 600 边距 20)) (标题字体 ‘毛笔楷体) (自动缩放 ‘(最小字号 16))
(配色风格 ‘((背景色 (蒙德里安风格 设计感))) (主要文字 (汇文明朝体 粉笔灰)) (装饰图案 随机几何图))
(卡片元素 ((居中标题 “汉语新解”) 分隔线 (排版输出 用户输入 英文 日语) 解释 (线条图 (批判内核 解释)) (极简总结 线条图))))
(defun start () “启动时运行” (let (system-role 新汉语老师) (print “说吧, 他们又用哪个词来忽悠你了?")))
;; Run rules:
;; 1. On startup, run the (start) function first
;; 2. Then call the main function (汉语新解 用户输入)
2025-01-05 TIL: Video Downloading with yt-dlp & Audio Transcription with 通义听悟
yt-dlp is a powerful command-line tool for downloading audio and video from thousands of websites.
yt-dlp -F https://www.bilibili.com/video/{video-id}/               # list the available format IDs for the video
yt-dlp -f 30032+30232 https://www.bilibili.com/video/{video-id}/   # download and merge the chosen video+audio formats
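The same download can also be scripted through yt-dlp’s Python API; a small sketch, carrying over the format string and placeholder URL from the commands above.

```python
# Embedding yt-dlp: roughly equivalent to `yt-dlp -f 30032+30232 <url>`.
import yt_dlp

ydl_opts = {"format": "30032+30232"}  # video+audio format IDs found via -F
with yt_dlp.YoutubeDL(ydl_opts) as ydl:
    ydl.download(["https://www.bilibili.com/video/{video-id}/"])
```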
Tingwu is an AI-powered assistant by Alibaba Cloud, designed for audio and video transcription, searching, summarization, and organization.