Community Topics in Chatbot

DanCvrcek · January 19, 2026, 2:47pm

I hope I don't break any/too many rules here.

I've been playing with a retrieval-augmented LLM for a bit - the idea is to make this community's knowledge more accessible. What I've done:

preprocessed about 3 years of topics into structured MD files
generated a vector database
integrated it with an Anthropic LLM
connected it to a Chainlit chatbot.
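
As a rough illustration of the retrieval part of the steps above (not the actual implementation), here is a minimal sketch using only the Python standard library. The file names, topic texts, and the functions `embed`, `retrieve`, and `build_prompt` are all hypothetical, and the bag-of-words "embedding" is a stand-in for a real embedding model and vector database. The call to the LLM itself is left out.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for the preprocessed Markdown topic files.
TOPICS = {
    "topic-101.md": "Certbot renewal fails because port 80 is blocked by the firewall",
    "topic-102.md": "Wildcard certificates require the DNS-01 challenge",
    "topic-103.md": "Rate limits apply per registered domain each week",
}

def embed(text):
    """Toy embedding: a bag-of-words count vector (assumption: a real
    setup would call an embedding model here instead)."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

# The "vector database": one vector per topic file, built once up front.
INDEX = {name: embed(body) for name, body in TOPICS.items()}

def retrieve(query, k=2):
    """Return the names of the k topics most similar to the query."""
    q = embed(query)
    ranked = sorted(INDEX, key=lambda name: cosine(q, INDEX[name]), reverse=True)
    return ranked[:k]

def build_prompt(query, k=2):
    """Assemble the text sent to the LLM and keep track of which topics fed it."""
    sources = retrieve(query, k)
    context = "\n".join(f"[{name}] {TOPICS[name]}" for name in sources)
    prompt = f"Answer using only these sources:\n{context}\n\nQuestion: {query}"
    return prompt, sources
```

Returning `sources` alongside the prompt is what lets the chatbot show which topics an answer was based on.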

It's on a relatively small EC2 so each query takes a couple of seconds.

I wonder if it could be useful, i.e., if someone can try it. https://axelspire.com/chatbot/chat/

aarongable · January 19, 2026, 4:03pm

Speaking only for myself here:

Per the Let's Encrypt Community Forum Terms of Service, all user-created content on this forum is licensed under CC BY-NC-SA 3.0. That means that it can be used and remixed by projects like this if and only if those projects provide attribution and are licensed under the same terms.

I don't see a CC BY-NC-SA license on your chatbot, but even more importantly, LLMs are structurally incapable of providing attribution to the posts and users from which they are drawing their generated text.

So while this project is interesting, I believe it is in violation of this community's terms of service and I'd like to ask you to take it down and delete the model.

DanCvrcek · January 19, 2026, 7:08pm

Hi, that's a good shout! Actually, each answer provided lists topics used for it. It uses a RAG control flow and seems to be very accurate in that respect. Not sure if you gave it a try, @aarongable .

I will review and either improve, if possible, or bin it.
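
Listing the topics used for each answer, as described above, could look roughly like the sketch below. The `Chunk` fields, the `attribution_footer` function, and the example URLs are hypothetical, not taken from the chatbot. The sketch only covers naming authors and sources, and says nothing about the share-alike or non-commercial terms.

```python
from dataclasses import dataclass

# License named in the forum's terms of service, as quoted in this thread.
LICENSE = "CC BY-NC-SA 3.0"

@dataclass(frozen=True)
class Chunk:
    """Metadata kept with every retrieved passage (hypothetical fields)."""
    author: str
    title: str
    url: str

def attribution_footer(chunks):
    """Build a footer naming each distinct source once, in retrieval order."""
    seen = []
    for chunk in chunks:
        if chunk not in seen:
            seen.append(chunk)
    lines = [f'- "{c.title}" by {c.author} ({c.url}), {LICENSE}' for c in seen]
    return "Sources:\n" + "\n".join(lines)

if __name__ == "__main__":
    retrieved = [
        Chunk("alice", "Renewal fails", "https://example.invalid/t/1"),
        Chunk("bob", "Wildcard certs", "https://example.invalid/t/2"),
        Chunk("alice", "Renewal fails", "https://example.invalid/t/1"),
    ]
    print(attribution_footer(retrieved))
```

The footer is built from retrieval metadata, not generated by the model, so it reflects exactly which passages were placed in the prompt.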

webprofusion · January 20, 2026, 1:31am

They all get it a bit wrong in general, and in particular, when dealing with ACME clients, you need to ingest their docs pages as well, as community threads are rarely specific or accurate enough.

You should add a footer with the generic license attribution, and, as the license is non-commercial, you could not safely use it for commercial gain.

Personally I don't care about AI using the info, as the general models like ChatGPT have already scraped this info many thousands of times. I use AI every day, all the time, some others probably do too. Reading the docs is now a luxury activity for the time-rich, I accept that.

However, models that try to be the documentation need to try extra hard to get it right and should favor authoritative answers.

Maiyannah · January 20, 2026, 4:36pm

This is a fatal flaw for this licensing model.

I am not a lawyer, but I can read court decisions, and it was adjudicated in Bartz v Anthropic that retaining data for training an LLM is a violation of copyright. (Ref: Susman Godfrey Secures $1.5 Billion Settlement in Landmark AI Piracy Case | Susman Godfrey L.L.P.) (A good background from Lawful Masses: https://www.youtube.com/watch?v=XWY8QmLD5H4 - Leonard French, Esq. is a practicing copyright attorney.)

The CC licensing requires that you give attribution, do not use the work commercially, and license the derivative material under the same license. Attribution is difficult, but it is a solvable problem if the portions of each work that are used are attributed to their author. Non-commercial use has been held to exclude these materials being fed to commercial AI that may then ingest them for use in resold AI products, but you could still use something like a local open source instance (such as a local Stable Diffusion instance for images). However, the requirement to license the result under the same terms as the CC license is probably fatal. According to the US Congress, generative AI content is not copyrightable and cannot be licensed: https://www.congress.gov/crs-product/LSB10922

I welcome someone with actual legal experience to contradict me, but that's my reading.

DanCvrcek · January 20, 2026, 9:40pm

Interesting - this is based on a chat with a lawyer (off the top of his head). The safe way, in this particular case, is to 'be inspired', 'don't keep texts' and 'don't attribute'. Basically, strip out the core technical facts, find public sources, and build your own knowledge base (with AI). What seems strange to me is that under CC BY-NC-SA you can't protect technical facts, only the way they are described.

Feels like something's not quite right.

Maiyannah · January 20, 2026, 10:09pm

Here in Canada, under the Copyright Act, R.S.C., 1985, c. C-42, all works, including technical works such as architectural plans or, in this case, software documentation and the discussions thereof, are copyrighted. The operative terms:

5 (1) Subject to this Act, copyright shall subsist in Canada, for the term hereinafter mentioned, in every original literary, dramatic, musical and artistic work if any one of the following conditions is met:

(a) in the case of any work, whether published or unpublished, including a cinematographic work, the author was, at the date of the making of the work, a citizen or subject of, or a person ordinarily resident in, a treaty country;

(b) in the case of a cinematographic work, whether published or unpublished, the maker, at the date of the making of the cinematographic work,

(i) if a corporation, had its headquarters in a treaty country, or

(ii) if a natural person, was a citizen or subject of, or a person ordinarily resident in, a treaty country; or

(c) in the case of a published work, including a cinematographic work,

(i) in relation to subparagraph 2.2(1)(a)(i), the first publication in such a quantity as to satisfy the reasonable demands of the public, having regard to the nature of the work, occurred in a treaty country, or

(ii) in relation to subparagraph 2.2(1)(a)(ii) or (iii), the first publication occurred in a treaty country.

The US Congress position on this was linked above, and it reflects their own legislation.

DanCvrcek · January 26, 2026, 6:41pm

An update on the chatbot RAG model. Based on the feedback here (and elsewhere), I have rebuilt it from scratch. It doesn't use any expressions of ideas or quotations from this website, i.e., anything subject to CC BY-NC-SA. The model generation used the content from here only to build a set of technical facts, and I added other sources as well. Once that was cleaned up, I generated full-text topics and guides from authoritative sources, using the facts purely as "headings". These new AI/synthetic datasets are used in the chatbot model.

griffin · January 26, 2026, 7:44pm

@aarongable

For your review.

system · February 25, 2026, 7:44pm

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.
