Show HN: Easy Webpage Summarizer – Quickly Summarize Webpages and YouTube Videos

https://github.com/cobanov/easy-web-summarizer

Summarize web pages and YouTube videos with pluggable LLM backends. Ships with first-class support for Ollama (local) and OpenAI, plus an optional Gradio web UI.

Installation

# Library + CLI, with Ollama backend
pip install 'websum[ollama]'
# With OpenAI backend
pip install 'websum[openai]'
# With the Gradio web UI
pip install 'websum[ui,ollama]'
# Everything
pip install 'websum[all]'

Using uv:

Quickstart

Library

from websum import Summarizer, OllamaBackend
s = Summarizer(backend=OllamaBackend(model="llama3:instruct"))
print(s.summarize("https://cobanov.dev/haftalik-bulten/hafta-13"))
print(s.summarize("https://www.youtube.com/watch?v=4pOpQwiUVXc"))
print(s.translate("Hello world", target_language="Turkish"))

Swap the backend without touching anything else:

from websum import Summarizer, OpenAIBackend
s = Summarizer(backend=OpenAIBackend(model="gpt-4o-mini"))

CLI

# Summarize a web page or YouTube URL (auto-detected)
websum summarize https://example.com
# Use OpenAI instead of Ollama
websum summarize https://example.com --backend openai --model gpt-4o-mini
# Translate
websum translate "Hello world" --target-language Turkish
# Launch the Gradio UI
websum ui --port 7860

Run websum --help for the full command reference.
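
The auto-detection mentioned above can be pictured as a hostname check. The following is a minimal sketch under stated assumptions, not websum's actual logic: the names `YOUTUBE_HOSTS`, `is_youtube_url`, and `pick_route` are hypothetical, and the host list is only illustrative.

```python
from urllib.parse import urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not websum's list).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def is_youtube_url(url: str) -> bool:
    """Return True when the URL's host is one of the known YouTube domains."""
    host = (urlparse(url).hostname or "").lower()
    return host in YOUTUBE_HOSTS


def pick_route(url: str) -> str:
    """Name the Summarizer method a dispatcher might call for this URL."""
    return "summarize_youtube" if is_youtube_url(url) else "summarize_web"


print(pick_route("https://www.youtube.com/watch?v=4pOpQwiUVXc"))
print(pick_route("https://example.com"))
```

Matching on the parsed hostname rather than on a substring avoids misrouting pages that merely mention "youtube" somewhere in their path or domain.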

API overview

| Object | Purpose |
| --- | --- |
| Summarizer | High-level API. summarize(url), summarize_web(url), summarize_youtube(url), translate(text). |
| SummarizerConfig | Chunking and language settings. |
| OllamaBackend, OpenAIBackend | Built-in backends. Frozen dataclasses with .build(). |
| LLMBackend (Protocol) | Implement this to plug in any backend. |
| BackendRegistry | Map string names to backend classes (used by the CLI). |

All public names are re-exported from the top-level websum package and listed in __all__.
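
To illustrate the idea of mapping string names to backend classes, here is a standalone sketch. It does not import websum and makes no claim about how BackendRegistry is implemented; `Registry`, `EchoBackend`, `register`, and `create` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EchoBackend:
    """Stand-in backend: a frozen dataclass exposing build(), as the table describes."""

    model: str = "echo-1"

    def build(self) -> object:
        # A real backend would return a chat model client here.
        return f"client for {self.model}"


class Registry:
    """Hypothetical name-to-class map, similar in spirit to what a CLI flag needs."""

    def __init__(self) -> None:
        self._backends: dict[str, type] = {}

    def register(self, name: str, cls: type) -> None:
        self._backends[name.lower()] = cls

    def create(self, name: str, **kwargs: object) -> object:
        try:
            cls = self._backends[name.lower()]
        except KeyError:
            known = ", ".join(sorted(self._backends)) or "none"
            raise ValueError(f"unknown backend {name!r}; known: {known}") from None
        return cls(**kwargs)


registry = Registry()
registry.register("echo", EchoBackend)
backend = registry.create("echo", model="echo-2")
print(backend.build())
```

Keeping the lookup in one place means a command-line option such as --backend only has to carry a string, while construction details stay with each backend class.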

Custom backends

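from dataclasses import dataclass

from websum import LLMBackend, Summarizer

@dataclass
class MyBackend:
    def build(self):
        # Import lazily so langchain_anthropic is only needed when this backend is used.
        from langchain_anthropic import ChatAnthropic
        return ChatAnthropic(model="claude-3-5-sonnet-latest")

# LLMBackend is a Protocol: any object with a build() method passes this check.
assert isinstance(MyBackend(), LLMBackend)  # Protocol check
s = Summarizer(backend=MyBackend())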
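
One caveat about the Protocol check is worth seeing in isolation. The sketch below uses only the standard library; `HasBuild` is a hypothetical stand-in for LLMBackend, assuming the latter is declared with `typing.runtime_checkable`, which the isinstance call above requires. At runtime, isinstance against such a Protocol only verifies that the method exists, not that its signature or return type match.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasBuild(Protocol):
    """Hypothetical stand-in for a backend protocol with a single build() method."""

    def build(self) -> object: ...


class Good:
    def build(self) -> object:
        return "model"


class WrongSignature:
    # Has a build attribute, but the signature does not match the protocol.
    def build(self, required_arg: int) -> object:
        return required_arg


class Missing:
    pass


print(isinstance(Good(), HasBuild))            # True
print(isinstance(WrongSignature(), HasBuild))  # True: only the method's presence is checked
print(isinstance(Missing(), HasBuild))         # False
```

A static type checker such as mypy is what catches the signature mismatch, so the runtime assert is best treated as a quick sanity check rather than full validation.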

Docker

docker build -t websum .
docker run -p 7860:7860 websum
# Use host networking when Ollama runs on the host.
# Docker ignores -p in this mode, so the port mapping is omitted.
docker run --network host websum

The image starts websum ui by default.

Migration from 0.1.x

The 0.1.x scripts under app/ (summarizer.py, translator.py, yt_summarizer.py, webui.py) are gone. Everything moved into the websum package with a typed, importable API.

| Before | After |
| --- | --- |
| python app/summarizer.py -u URL | websum summarize URL |
| python app/webui.py | websum ui |
| from summarizer import setup_summarization_chain | from websum import Summarizer |
| Hardcoded ChatOllama | OllamaBackend / OpenAIBackend / custom LLMBackend |
| pip install -r requirements.txt | pip install 'websum[ollama]' |

Development

git clone https://github.com/cobanov/websum
cd websum
uv sync --all-extras
uv run pre-commit install
uv run pytest
uv run ruff check .
uv run mypy src/websum

See CONTRIBUTING.md for the full guide.

License

MIT. See LICENSE.

{
"by": "cobanov",
"descendants": 0,
"id": 40226459,
"score": 11,
"text": "I'm excited to share a project: a Python script that uses the LangChain framework and the ChatOllama model to generate concise summaries of webpages and YouTube videos. For those who prefer a graphical interface, it includes a Gradio app that runs in the browser so you can use the summarizer interactively. It can also be containerized and deployed with Docker.<p>The tool suits anyone who needs quick insights without reading the entire content. It's open for contributions, so if you're interested in improving or extending its functionality, feel free to dive in!",
"time": 1714584709,
"title": "Show HN: Easy Webpage Summarizer – Quickly Summarize Webpages and YouTube Videos",
"type": "story",
"url": "https://github.com/cobanov/easy-web-summarizer"
}