Compare commits: master...dependabot (1 commit)
Author | SHA1 | Date |
---|---|---|
dependabot-preview[bot] | b0cd2e7928 |
|
@ -11,6 +11,3 @@ __pycache__/
|
||||||
.editorconfig
|
.editorconfig
|
||||||
.*.swp
|
.*.swp
|
||||||
config.json
|
config.json
|
||||||
venv/
|
|
||||||
*.log
|
|
||||||
filter.txt
|
|
33
README.md
|
@ -8,7 +8,7 @@ This version makes quite a few changes from [the original](https://github.com/Je
|
||||||
- Doesn't unnecessarily redownload all toots every time
|
- Doesn't unnecessarily redownload all toots every time
|
||||||
|
|
||||||
## FediBooks
|
## FediBooks
|
||||||
Before you use mstdn-ebooks to create your own ebooks bot, I recommend checking out [FediBooks(Broken link)](https://fedibooks.com). Compared to mstdn-ebooks, FediBooks offers a few advantages:
|
Before you use mstdn-ebooks to create your own ebooks bot, I recommend checking out [FediBooks](https://fedibooks.com). Compared to mstdn-ebooks, FediBooks offers a few advantages:
|
||||||
- Hosted and maintained by someone else - you don't have to worry about updating, keeping the computer on, etc
|
- Hosted and maintained by someone else - you don't have to worry about updating, keeping the computer on, etc
|
||||||
- No installation required
|
- No installation required
|
||||||
- A nice UI for managing your bot(s)
|
- A nice UI for managing your bot(s)
|
||||||
|
@ -25,7 +25,7 @@ Like mstdn-ebooks, FediBooks is free, both as in free of charge and free to modi
|
||||||
Secure fetch (aka authorised fetches, authenticated fetches, secure mode...) is *not* supported by mstdn-ebooks, and will fail to download any posts from users on instances with secure fetch enabled. For more information, see [this wiki page](https://github.com/Lynnesbian/mstdn-ebooks/wiki/Secure-fetch).
|
Secure fetch (aka authorised fetches, authenticated fetches, secure mode...) is *not* supported by mstdn-ebooks, and will fail to download any posts from users on instances with secure fetch enabled. For more information, see [this wiki page](https://github.com/Lynnesbian/mstdn-ebooks/wiki/Secure-fetch).
|
||||||
|
|
||||||
## Install/usage Guide
|
## Install/usage Guide
|
||||||
An installation and usage guide is available [here(broken link)](https://cloud.lynnesbian.space/s/jozbRi69t4TpD95). It's primarily targeted at Linux, but it should be possible on BSD, macOS, etc. I've also put some effort into providing steps for Windows, but I can't make any guarantees as to its effectiveness.
|
An installation and usage guide is available [here](https://cloud.lynnesbian.space/s/jozbRi69t4TpD95). It's primarily targeted at Linux, but it should be possible on BSD, macOS, etc. I've also put some effort into providing steps for Windows, but I can't make any guarantees as to its effectiveness.
|
||||||
|
|
||||||
### Docker
|
### Docker
|
||||||
While there is a Docker version provided, it is **not guaranteed to work**. I personally don't use Docker and don't know how the Dockerfile works; it was created over a year ago by someone else and hasn't been updated since. It might work for you, it might not. If you'd like to help update the Dockerfile, please get in touch with me on the Fediverse.
|
While there is a Docker version provided, it is **not guaranteed to work**. I personally don't use Docker and don't know how the Dockerfile works; it was created over a year ago by someone else and hasn't been updated since. It might work for you, it might not. If you'd like to help update the Dockerfile, please get in touch with me on the Fediverse.
|
||||||
|
@ -48,19 +48,18 @@ I recommend that you create your bot's account on a Mastodon instance. Creating
|
||||||
## Configuration
|
## Configuration
|
||||||
Configuring mstdn-ebooks is accomplished by editing `config.json`. If you want to use a different file for configuration, specify it with the `--cfg` argument. For example, if you want to use `/home/lynne/c.json` instead, you would run `python3 main.py --cfg /home/lynne/c.json` instead of just `python3 main.py`.
|
Configuring mstdn-ebooks is accomplished by editing `config.json`. If you want to use a different file for configuration, specify it with the `--cfg` argument. For example, if you want to use `/home/lynne/c.json` instead, you would run `python3 main.py --cfg /home/lynne/c.json` instead of just `python3 main.py`.
|
||||||
|
|
||||||
| Setting | Default | Meaning |
|
| Setting | Default | Meaning |
|
||||||
|--------------------------|-----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
|--------------------|------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
|
||||||
| site | https://botsin.space | The instance your bot will log in to and post from. This must start with `https://` or `http://` (preferably the former) |
|
| site | https://botsin.space | The instance your bot will log in to and post from. This must start with `https://` or `http://` (preferably the former) |
|
||||||
| cw | null | The content warning (aka subject) mstdn-ebooks will apply to non-error posts. |
|
| cw | null | The content warning (aka subject) mstdn-ebooks will apply to non-error posts. |
|
||||||
| cw_reply | false | If true, replies will be CW'd |
|
| instance_blacklist | ["bofa.lol", "witches.town", "knzk.me"] | If your bot is following someone from a blacklisted instance, it will skip over them and not download their posts. This is useful for ensuring that mstdn-ebooks doesn't waste time trying to download posts from dead instances, without you having to unfollow the user(s) from them. |
|
||||||
| instance_blacklist | ["bofa.lol", "witches.town", "knzk.me"] | If your bot is following someone from a blacklisted instance, it will skip over them and not download their posts. This is useful for ensuring that mstdn-ebooks doesn't waste time trying to download posts from dead instances, without you having to unfollow the user(s) from them. |
|
| learn_from_cw | false | If true, mstdn-ebooks will learn from CW'd posts. |
|
||||||
| learn_from_cw | false | If true, mstdn-ebooks will learn from CW'd posts. |
|
| mention_handling | 1 | 0: Never use mentions. 1: Only generate fake mentions in the middle of posts, never at the start. 2: Use mentions as normal (old behaviour). |
|
||||||
| mention_handling | 1 | 0: Never use mentions. 1: Only generate fake mentions in the middle of posts, never at the start. 2: Use mentions as normal (old behaviour). |
|
| max_thread_length | 15 | The maximum number of bot posts in a thread before it stops replying. A thread can be 10 or 10000 posts long, but the bot will stop after it has posted `max_thread_length` times. |
|
||||||
| max_thread_length | 15 | The maximum number of bot posts in a thread before it stops replying. A thread can be 10 or 10000 posts long, but the bot will stop after it has posted `max_thread_length` times. |
|
| strip_paired_punctuation | false | If true, mstdn-ebooks will remove punctuation that commonly appears in pairs, like " and (). This avoids the issue of posts that open a bracket (or quote) without closing it. |
|
||||||
| strip_paired_punctuation | false | If true, mstdn-ebooks will remove punctuation that commonly appears in pairs, like " and (). This avoids the issue of posts that open a bracket (or quote) without closing it. |
|
|
||||||
| limit_length | false | If true, the sentence length will be random between `length_lower_limit` and `length_upper_limit` |
|
|
||||||
| length_lower_limit | 5 | The lower bound in the random number range above. Only matters if `limit_length` is true. |
|
|
||||||
| length_upper_limit | 50 | The upper bound in the random number range above. Can be the same as `length_lower_limit` to disable randomness. Only matters if `limit_length` is true. |
|
|
||||||
| overlap_ratio_enabled | false | If true, checks the output's similarity to the original posts. |
|
|
||||||
| overlap_ratio | 0.7 | The ratio that determines whether the output is too similar to the original posts. As the ratio decreases, both the interestingness of the output and the likelihood of failing to create output increase. Only matters if `overlap_ratio_enabled` is true. |
|
|
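How the defaults in the table combine with a user's `config.json` can be sketched as follows. This is a minimal sketch, not the project's actual loader: `load_config` and `write_example` are hypothetical helpers, and only a handful of the settings from the table are included.

```python
import json
import os
import tempfile

# A subset of the defaults listed in the table (illustration only).
DEFAULTS = {
    "site": "https://botsin.space",
    "cw": None,
    "learn_from_cw": False,
    "mention_handling": 1,
    "max_thread_length": 15,
}


def load_config(path):
    """Return the defaults overlaid with whatever the JSON file at `path` sets.

    Hypothetical helper: a missing file simply yields the defaults.
    """
    cfg = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fp:
            cfg.update(json.load(fp))
    return cfg


def write_example(settings):
    """Write `settings` to a temporary JSON file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w", encoding="utf-8") as fp:
        json.dump(settings, fp)
    return path
```

Any key the file leaves out keeps its default, so a `config.json` only needs to list the settings that differ.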
||||||
|
|
||||||
|
## Donating
|
||||||
|
Please don't feel obligated to donate at all.
|
||||||
|
|
||||||
|
- [Ko-Fi](https://ko-fi.com/lynnesbian) allows you to make one-off payments in increments of AU$3. These payments are not taxed.
|
||||||
|
- [PayPal](https://paypal.me/lynnesbian) allows you to make one-off payments of any amount in a range of currencies. These payments may be taxed.
|
||||||
|
|
|
@ -1,16 +0,0 @@
|
||||||
{
|
|
||||||
"site": "https://botsin.space",
|
|
||||||
"cw": null,
|
|
||||||
"instance_blacklist": ["bofa.lol", "witches.town", "knzk.me"],
|
|
||||||
"learn_from_cw": false,
|
|
||||||
"mention_handling": 1,
|
|
||||||
"max_thread_length": 15,
|
|
||||||
"strip_paired_punctuation": false,
|
|
||||||
"limit_length": false,
|
|
||||||
"length_lower_limit": 5,
|
|
||||||
"length_upper_limit": 50,
|
|
||||||
"overlap_ratio_enabled": false,
|
|
||||||
"overlap_ratio": 0.7,
|
|
||||||
"word_filter": 0,
|
|
||||||
"website": "https://git.nixnet.services/amber/amber-ebooks"
|
|
||||||
}
|
|
|
@ -1,3 +0,0 @@
|
||||||
@reboot $HOME/amber-ebooks/reply.py >> $HOME/reply.log 2>>$HOME/reply.log #keep the reply process running in the background
|
|
||||||
*/20 * * * * $HOME/amber-ebooks/gen.py >> $HOME/gen.log 2>>$HOME/gen.log #post every twenty minutes
|
|
||||||
*/15 * * * * $HOME/amber-ebooks/main.py >> $HOME/main.log 2>>$HOME/main.log #refresh the database every 15 minutes
|
|
|
@ -1,5 +0,0 @@
|
||||||
put
|
|
||||||
bad
|
|
||||||
words
|
|
||||||
in
|
|
||||||
filter.txt
|
|
66
functions.py
|
@ -5,16 +5,14 @@
|
||||||
|
|
||||||
import markovify
|
import markovify
|
||||||
from bs4 import BeautifulSoup
|
from bs4 import BeautifulSoup
|
||||||
from random import randint
|
|
||||||
import re, multiprocessing, sqlite3, shutil, os, html
|
import re, multiprocessing, sqlite3, shutil, os, html
|
||||||
|
|
||||||
|
|
||||||
def make_sentence(output, cfg):
|
def make_sentence(output, cfg):
|
||||||
class nlt_fixed(markovify.NewlineText): # modified version of NewlineText that never rejects sentences
|
class nlt_fixed(markovify.NewlineText): #modified version of NewlineText that never rejects sentences
|
||||||
def test_sentence_input(self, sentence):
|
def test_sentence_input(self, sentence):
|
||||||
return True # all sentences are valid <3
|
return True #all sentences are valid <3
|
||||||
|
|
||||||
shutil.copyfile("toots.db", "toots-copy.db") # create a copy of the database because reply.py will be using the main one
|
shutil.copyfile("toots.db", "toots-copy.db") #create a copy of the database because reply.py will be using the main one
|
||||||
db = sqlite3.connect("toots-copy.db")
|
db = sqlite3.connect("toots-copy.db")
|
||||||
db.text_factory = str
|
db.text_factory = str
|
||||||
c = db.cursor()
|
c = db.cursor()
|
||||||
|
@ -27,27 +25,19 @@ def make_sentence(output, cfg):
|
||||||
output.send("Database is empty! Try running main.py.")
|
output.send("Database is empty! Try running main.py.")
|
||||||
return
|
return
|
||||||
|
|
||||||
nlt = markovify.NewlineText if cfg['overlap_ratio_enabled'] else nlt_fixed
|
model = nlt_fixed(
|
||||||
|
|
||||||
model = nlt(
|
|
||||||
"\n".join([toot[0] for toot in toots])
|
"\n".join([toot[0] for toot in toots])
|
||||||
)
|
)
|
||||||
|
|
||||||
db.close()
|
db.close()
|
||||||
os.remove("toots-copy.db")
|
os.remove("toots-copy.db")
|
||||||
|
|
||||||
if cfg['limit_length']:
|
toots_str = None
|
||||||
sentence_len = randint(cfg['length_lower_limit'], cfg['length_upper_limit'])
|
|
||||||
|
|
||||||
sentence = None
|
sentence = None
|
||||||
tries = 0
|
tries = 0
|
||||||
while sentence is None and tries < 10:
|
while sentence is None and tries < 10:
|
||||||
sentence = model.make_short_sentence(
|
sentence = model.make_short_sentence(500, tries=10000)
|
||||||
max_chars=500,
|
|
||||||
tries=10000,
|
|
||||||
max_overlap_ratio=cfg['overlap_ratio'] if cfg['overlap_ratio_enabled'] else 0.7,
|
|
||||||
max_words=sentence_len if cfg['limit_length'] else None
|
|
||||||
)
|
|
||||||
tries = tries + 1
|
tries = tries + 1
|
||||||
|
|
||||||
# optionally remove mentions
|
# optionally remove mentions
|
||||||
|
@ -56,57 +46,43 @@ def make_sentence(output, cfg):
|
||||||
elif cfg['mention_handling'] == 0:
|
elif cfg['mention_handling'] == 0:
|
||||||
sentence = re.sub(r"\S*@\u200B\S*\s?", "", sentence)
|
sentence = re.sub(r"\S*@\u200B\S*\s?", "", sentence)
|
||||||
|
|
||||||
# optionally regenerate the post if it has a filtered word. TODO: case-insensitivity, Scunthorpe problem
|
|
||||||
if cfg['word_filter'] == 1:
|
|
||||||
try:
|
|
||||||
fp = open('./filter.txt')
|
|
||||||
for word in fp:
|
|
||||||
word = re.sub("\n", "", word)
|
|
||||||
if word.lower() in sentence:
|
|
||||||
sentence=""
|
|
||||||
|
|
||||||
finally:
|
|
||||||
fp.close()
|
|
||||||
output.send(sentence)
|
output.send(sentence)
|
||||||
|
|
||||||
|
|
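The filter removed in this hunk carries a TODO for case-insensitivity and the Scunthorpe problem, and it compares `word.lower()` against the sentence without lowercasing the sentence. One way to address both is sketched below, under assumptions: `contains_filtered_word` is a hypothetical helper, not part of the repository, and it treats each filter entry as a whole word.

```python
import re


def contains_filtered_word(sentence, filter_words):
    """Return True if any filter word appears in `sentence` as a whole word.

    Matching is case-insensitive, and word boundaries stop a filter word
    from matching inside a longer, unrelated word.
    """
    for word in filter_words:
        word = word.strip()
        if not word:
            continue  # skip blank lines in the filter file
        pattern = r"\b" + re.escape(word) + r"\b"
        if re.search(pattern, sentence, flags=re.IGNORECASE):
            return True
    return False
```

Word boundaries only behave as expected for entries that start and end with a letter, digit, or underscore; entries with leading or trailing punctuation would need a different boundary check.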
||||||
def make_toot(cfg):
|
def make_toot(cfg):
|
||||||
toot = None
|
toot = None
|
||||||
pin, pout = multiprocessing.Pipe(False)
|
pin, pout = multiprocessing.Pipe(False)
|
||||||
p = multiprocessing.Process(target=make_sentence, args=[pout, cfg])
|
p = multiprocessing.Process(target = make_sentence, args = [pout, cfg])
|
||||||
p.start()
|
p.start()
|
||||||
p.join(5) # wait 5 seconds to get something
|
p.join(5) #wait 5 seconds to get something
|
||||||
if p.is_alive(): # if it's still trying to make a toot after 5 seconds
|
if p.is_alive(): #if it's still trying to make a toot after 5 seconds
|
||||||
p.terminate()
|
p.terminate()
|
||||||
p.join()
|
p.join()
|
||||||
else:
|
else:
|
||||||
toot = pin.recv()
|
toot = pin.recv()
|
||||||
|
|
||||||
if toot is None:
|
if toot == None:
|
||||||
toot = "post failed"
|
toot = "Toot generation failed! Contact Lynne (lynnesbian@fedi.lynnesbian.space) for assistance."
|
||||||
return toot
|
return toot
|
||||||
|
|
||||||
|
|
||||||
def extract_toot(toot):
|
def extract_toot(toot):
|
||||||
toot = re.sub("<br>", "\n", toot)
|
toot = html.unescape(toot) # convert HTML escape codes to text
|
||||||
toot = html.unescape(toot) # convert HTML escape codes to text
|
|
||||||
soup = BeautifulSoup(toot, "html.parser")
|
soup = BeautifulSoup(toot, "html.parser")
|
||||||
for lb in soup.select("br"): # replace <br> with linebreak
|
for lb in soup.select("br"): # replace <br> with linebreak
|
||||||
lb.name = "\n"
|
lb.replace_with("\n")
|
||||||
|
|
||||||
for p in soup.select("p"): # ditto for <p>
|
for p in soup.select("p"): # ditto for <p>
|
||||||
p.name = "\n"
|
p.replace_with("\n")
|
||||||
|
|
||||||
for ht in soup.select("a.hashtag"): # convert hashtags from links to text
|
for ht in soup.select("a.hashtag"): # convert hashtags from links to text
|
||||||
ht.unwrap()
|
ht.unwrap()
|
||||||
|
|
||||||
for link in soup.select("a"): # convert <a href='https://example.com'>example.com</a> to just https://example.com
|
for link in soup.select("a"): #convert <a href='https://example.com'>example.com</a> to just https://example.com
|
||||||
if 'href' in link:
|
if 'href' in link:
|
||||||
# apparently not all a tags have a href, which is understandable if you're doing normal web stuff, but on a social media platform??
|
# apparently not all a tags have a href, which is understandable if you're doing normal web stuff, but on a social media platform??
|
||||||
link.replace_with(link["href"])
|
link.replace_with(link["href"])
|
||||||
|
|
||||||
text = soup.get_text()
|
text = soup.get_text()
|
||||||
text = re.sub(r"https://([^/]+)/(@[^\s]+)", r"\2@\1", text) # put mastodon-style mentions back in
|
text = re.sub(r"https://([^/]+)/(@[^\s]+)", r"\2@\1", text) # put mastodon-style mentions back in
|
||||||
text = re.sub(r"https://([^/]+)/users/([^\s/]+)", r"@\2@\1", text) # put pleroma-style mentions back in
|
text = re.sub(r"https://([^/]+)/users/([^\s/]+)", r"@\2@\1", text) # put pleroma-style mentions back in
|
||||||
text = text.rstrip("\n") # remove trailing newline(s)
|
text = text.rstrip("\n") # remove trailing newline(s)
|
||||||
return text
|
return text
|
||||||
|
|
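The two `re.sub` calls at the end of `extract_toot` turn profile URLs back into handles. A small standalone sketch of just that step, using the same patterns as the diff; the name `restore_mentions` is chosen here for illustration and the example URLs are made up.

```python
import re


def restore_mentions(text):
    """Rewrite profile URLs as @user@instance handles."""
    # Mastodon-style: https://instance/@user -> @user@instance
    text = re.sub(r"https://([^/]+)/(@[^\s]+)", r"\2@\1", text)
    # Pleroma-style: https://instance/users/user -> @user@instance
    text = re.sub(r"https://([^/]+)/users/([^\s/]+)", r"@\2@\1", text)
    return text
```

Both patterns only match `https://` URLs, so text without such links passes through unchanged.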
37
gen.py
|
@ -8,11 +8,9 @@ import argparse, json, re
|
||||||
import functions
|
import functions
|
||||||
|
|
||||||
parser = argparse.ArgumentParser(description='Generate and post a toot.')
|
parser = argparse.ArgumentParser(description='Generate and post a toot.')
|
||||||
parser.add_argument(
|
parser.add_argument('-c', '--cfg', dest='cfg', default='config.json', nargs='?',
|
||||||
'-c', '--cfg', dest='cfg', default='config.json', nargs='?',
|
|
||||||
help="Specify a custom location for config.json.")
|
help="Specify a custom location for config.json.")
|
||||||
parser.add_argument(
|
parser.add_argument('-s', '--simulate', dest='simulate', action='store_true',
|
||||||
'-s', '--simulate', dest='simulate', action='store_true',
|
|
||||||
help="Print the toot without actually posting it. Use this to make sure your bot's actually working.")
|
help="Print the toot without actually posting it. Use this to make sure your bot's actually working.")
|
||||||
|
|
||||||
args = parser.parse_args()
|
args = parser.parse_args()
|
||||||
|
@ -23,10 +21,10 @@ client = None
|
||||||
|
|
||||||
if not args.simulate:
|
if not args.simulate:
|
||||||
client = Mastodon(
|
client = Mastodon(
|
||||||
client_id=cfg['client']['id'],
|
client_id=cfg['client']['id'],
|
||||||
client_secret=cfg['client']['secret'],
|
client_secret=cfg['client']['secret'],
|
||||||
access_token=cfg['secret'],
|
access_token=cfg['secret'],
|
||||||
api_base_url=cfg['site'])
|
api_base_url=cfg['site'])
|
||||||
|
|
||||||
if __name__ == '__main__':
|
if __name__ == '__main__':
|
||||||
toot = functions.make_toot(cfg)
|
toot = functions.make_toot(cfg)
|
||||||
|
@ -34,22 +32,11 @@ if __name__ == '__main__':
|
||||||
toot = re.sub(r"[\[\]\(\)\{\}\"“”«»„]", "", toot)
|
toot = re.sub(r"[\[\]\(\)\{\}\"“”«»„]", "", toot)
|
||||||
if not args.simulate:
|
if not args.simulate:
|
||||||
try:
|
try:
|
||||||
if toot == "":
|
client.status_post(toot, visibility = 'unlisted', spoiler_text = cfg['cw'])
|
||||||
print("Post has been filtered, or post generation has failed")
|
except Exception as err:
|
||||||
toot = functions.make_toot(cfg)
|
toot = "An error occurred while submitting the generated post. Contact lynnesbian@fedi.lynnesbian.space for assistance."
|
||||||
if toot == "":
|
client.status_post(toot, visibility = 'unlisted', spoiler_text = "Error!")
|
||||||
client.status_post("Recursion is a bitch. Post generation failed.", visibility='unlisted', spoiler_text=cfg['cw'])
|
|
||||||
else:
|
|
||||||
client.status_post(toot, visibility='unlisted', spoiler_text=cfg['cw'])
|
|
||||||
else:
|
|
||||||
client.status_post(toot, visibility='unlisted', spoiler_text=cfg['cw'])
|
|
||||||
except Exception:
|
|
||||||
toot = "@amber@toot.site Something went fucky"
|
|
||||||
client.status_post(toot, visibility='unlisted', spoiler_text="Error!")
|
|
||||||
try:
|
try:
|
||||||
if str(toot) == "":
|
print(toot)
|
||||||
print("Filtered")
|
|
||||||
else:
|
|
||||||
print(toot)
|
|
||||||
except UnicodeEncodeError:
|
except UnicodeEncodeError:
|
||||||
print(toot.encode("ascii", "ignore")) # encode as ASCII, dropping any non-ASCII characters
|
print(toot.encode("ascii", "ignore")) # encode as ASCII, dropping any non-ASCII characters
|
||||||
|
|
144
main.py
|
@ -5,33 +5,29 @@
|
||||||
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
|
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
|
||||||
|
|
||||||
from mastodon import Mastodon, MastodonUnauthorizedError
|
from mastodon import Mastodon, MastodonUnauthorizedError
|
||||||
import sqlite3, signal, sys, json, re, argparse
|
from os import path
|
||||||
|
from bs4 import BeautifulSoup
|
||||||
|
import os, sqlite3, signal, sys, json, re, shutil, argparse
|
||||||
import requests
|
import requests
|
||||||
import functions
|
import functions
|
||||||
|
|
||||||
parser = argparse.ArgumentParser(description='Log in and download posts.')
|
parser = argparse.ArgumentParser(description='Log in and download posts.')
|
||||||
parser.add_argument('-c', '--cfg', dest='cfg', default='config.json', nargs='?', help="Specify a custom location for config.json.")
|
parser.add_argument('-c', '--cfg', dest='cfg', default='config.json', nargs='?',
|
||||||
|
help="Specify a custom location for config.json.")
|
||||||
|
|
||||||
args = parser.parse_args()
|
args = parser.parse_args()
|
||||||
|
|
||||||
scopes = ["read:statuses", "read:accounts", "read:follows", "write:statuses", "read:notifications", "write:accounts"]
|
scopes = ["read:statuses", "read:accounts", "read:follows", "write:statuses", "read:notifications", "write:accounts"]
|
||||||
# cfg defaults
|
#cfg defaults
|
||||||
|
|
||||||
cfg = {
|
cfg = {
|
||||||
"site": "https://botsin.space",
|
"site": "https://botsin.space",
|
||||||
"cw": None,
|
"cw": None,
|
||||||
"cw_reply": False,
|
"instance_blacklist": ["bofa.lol", "witches.town", "knzk.me"], # rest in piece
|
||||||
"instance_blacklist": ["bofa.lol", "witches.town", "knzk.me"], # rest in piece
|
|
||||||
"learn_from_cw": False,
|
"learn_from_cw": False,
|
||||||
"mention_handling": 1,
|
"mention_handling": 1,
|
||||||
"max_thread_length": 15,
|
"max_thread_length": 15,
|
||||||
"strip_paired_punctuation": False,
|
"strip_paired_punctuation": False
|
||||||
"limit_length": False,
|
|
||||||
"length_lower_limit": 5,
|
|
||||||
"length_upper_limit": 50,
|
|
||||||
"overlap_ratio_enabled": False,
|
|
||||||
"overlap_ratio": 0.7,
|
|
||||||
"word_filter": 0
|
|
||||||
}
|
}
|
||||||
|
|
||||||
try:
|
try:
|
||||||
|
@ -47,8 +43,7 @@ if not cfg['site'].startswith("https://") and not cfg['site'].startswith("http:/
|
||||||
|
|
||||||
if "client" not in cfg:
|
if "client" not in cfg:
|
||||||
print("No application info -- registering application with {}".format(cfg['site']))
|
print("No application info -- registering application with {}".format(cfg['site']))
|
||||||
client_id, client_secret = Mastodon.create_app(
|
client_id, client_secret = Mastodon.create_app("mstdn-ebooks",
|
||||||
"mstdn-ebooks",
|
|
||||||
api_base_url=cfg['site'],
|
api_base_url=cfg['site'],
|
||||||
scopes=scopes,
|
scopes=scopes,
|
||||||
website="https://github.com/Lynnesbian/mstdn-ebooks")
|
website="https://github.com/Lynnesbian/mstdn-ebooks")
|
||||||
|
@ -60,26 +55,23 @@ if "client" not in cfg:
|
||||||
|
|
||||||
if "secret" not in cfg:
|
if "secret" not in cfg:
|
||||||
print("No user credentials -- logging in to {}".format(cfg['site']))
|
print("No user credentials -- logging in to {}".format(cfg['site']))
|
||||||
client = Mastodon(
|
client = Mastodon(client_id = cfg['client']['id'],
|
||||||
client_id=cfg['client']['id'],
|
client_secret = cfg['client']['secret'],
|
||||||
client_secret=cfg['client']['secret'],
|
|
||||||
api_base_url=cfg['site'])
|
api_base_url=cfg['site'])
|
||||||
|
|
||||||
print("Open this URL and authenticate to give mstdn-ebooks access to your bot's account: {}".format(client.auth_request_url(scopes=scopes)))
|
print("Open this URL and authenticate to give mstdn-ebooks access to your bot's account: {}".format(client.auth_request_url(scopes=scopes)))
|
||||||
cfg['secret'] = client.log_in(code=input("Secret: "), scopes=scopes)
|
cfg['secret'] = client.log_in(code=input("Secret: "), scopes=scopes)
|
||||||
|
|
||||||
open(args.cfg, "w").write(re.sub(",", ",\n", json.dumps(cfg)))
|
json.dump(cfg, open(args.cfg, "w+"))
|
||||||
|
|
||||||
|
|
||||||
def extract_toot(toot):
|
def extract_toot(toot):
|
||||||
toot = functions.extract_toot(toot)
|
toot = functions.extract_toot(toot)
|
||||||
toot = toot.replace("@", "@\u200B") # put a zws between @ and username to avoid mentioning
|
toot = toot.replace("@", "@\u200B") #put a zws between @ and username to avoid mentioning
|
||||||
return(toot)
|
return(toot)
|
||||||
|
|
||||||
|
|
||||||
client = Mastodon(
|
client = Mastodon(
|
||||||
client_id=cfg['client']['id'],
|
client_id=cfg['client']['id'],
|
||||||
client_secret=cfg['client']['secret'],
|
client_secret = cfg['client']['secret'],
|
||||||
access_token=cfg['secret'],
|
access_token=cfg['secret'],
|
||||||
api_base_url=cfg['site'])
|
api_base_url=cfg['site'])
|
||||||
|
|
||||||
|
@ -92,10 +84,9 @@ except MastodonUnauthorizedError:
|
||||||
following = client.account_following(me.id)
|
following = client.account_following(me.id)
|
||||||
|
|
||||||
db = sqlite3.connect("toots.db")
|
db = sqlite3.connect("toots.db")
|
||||||
db.text_factory = str
|
db.text_factory=str
|
||||||
c = db.cursor()
|
c = db.cursor()
|
||||||
c.execute("CREATE TABLE IF NOT EXISTS `toots` (sortid INTEGER UNIQUE PRIMARY KEY AUTOINCREMENT, id VARCHAR NOT NULL, cw INT NOT NULL DEFAULT 0, userid VARCHAR NOT NULL, uri VARCHAR NOT NULL, content VARCHAR NOT NULL)")
|
c.execute("CREATE TABLE IF NOT EXISTS `toots` (sortid INTEGER UNIQUE PRIMARY KEY AUTOINCREMENT, id VARCHAR NOT NULL, cw INT NOT NULL DEFAULT 0, userid VARCHAR NOT NULL, uri VARCHAR NOT NULL, content VARCHAR NOT NULL)")
|
||||||
c.execute("CREATE TRIGGER IF NOT EXISTS `dedup` AFTER INSERT ON toots FOR EACH ROW BEGIN DELETE FROM toots WHERE rowid NOT IN (SELECT MIN(sortid) FROM toots GROUP BY uri ); END; ")
|
|
||||||
db.commit()
|
db.commit()
|
||||||
|
|
||||||
tableinfo = c.execute("PRAGMA table_info(`toots`)").fetchall()
|
tableinfo = c.execute("PRAGMA table_info(`toots`)").fetchall()
|
||||||
|
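The `dedup` trigger on the left-hand side deletes, after every insert, any row whose `sortid` is not the smallest for its `uri`, so the earliest copy of a post is the one that survives. A minimal in-memory sketch of that behaviour, assuming a cut-down table (only `sortid`, `uri`, and `content`; the real schema has more columns) and made-up URIs:

```python
import sqlite3


def demo_dedup():
    """Insert the same uri twice and return the rows that survive."""
    db = sqlite3.connect(":memory:")
    c = db.cursor()
    # Cut-down version of the toots table, for illustration only.
    c.execute(
        "CREATE TABLE toots ("
        "sortid INTEGER UNIQUE PRIMARY KEY AUTOINCREMENT, "
        "uri VARCHAR NOT NULL, content VARCHAR NOT NULL)"
    )
    # Same trigger body as in the diff.
    c.execute(
        "CREATE TRIGGER IF NOT EXISTS dedup AFTER INSERT ON toots "
        "FOR EACH ROW BEGIN "
        "DELETE FROM toots WHERE rowid NOT IN "
        "(SELECT MIN(sortid) FROM toots GROUP BY uri); END;"
    )
    rows = [
        ("https://example.social/1", "first copy"),
        ("https://example.social/1", "second copy"),
        ("https://example.social/2", "another post"),
    ]
    c.executemany("INSERT INTO toots (uri, content) VALUES (?, ?)", rows)
    db.commit()
    result = c.execute("SELECT uri, content FROM toots ORDER BY sortid").fetchall()
    db.close()
    return result
```

Calling `demo_dedup()` leaves one row per `uri`, with "first copy" kept and "second copy" removed by the trigger.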
@ -118,7 +109,7 @@ if not found:
|
||||||
c.execute("CREATE TABLE `toots_temp` (sortid INTEGER UNIQUE PRIMARY KEY AUTOINCREMENT, id VARCHAR NOT NULL, cw INT NOT NULL DEFAULT 0, userid VARCHAR NOT NULL, uri VARCHAR NOT NULL, content VARCHAR NOT NULL)")
|
c.execute("CREATE TABLE `toots_temp` (sortid INTEGER UNIQUE PRIMARY KEY AUTOINCREMENT, id VARCHAR NOT NULL, cw INT NOT NULL DEFAULT 0, userid VARCHAR NOT NULL, uri VARCHAR NOT NULL, content VARCHAR NOT NULL)")
|
||||||
for f in following:
|
for f in following:
|
||||||
user_toots = c.execute("SELECT * FROM `toots` WHERE userid LIKE ? ORDER BY id", (f.id,)).fetchall()
|
user_toots = c.execute("SELECT * FROM `toots` WHERE userid LIKE ? ORDER BY id", (f.id,)).fetchall()
|
||||||
if user_toots is None:
|
if user_toots == None:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
if columns[-1] == "cw":
|
if columns[-1] == "cw":
|
||||||
|
@ -130,17 +121,14 @@ if not found:
|
||||||
|
|
||||||
c.execute("DROP TABLE `toots`")
|
c.execute("DROP TABLE `toots`")
|
||||||
c.execute("ALTER TABLE `toots_temp` RENAME TO `toots`")
|
c.execute("ALTER TABLE `toots_temp` RENAME TO `toots`")
|
||||||
c.execute("CREATE TRIGGER IF NOT EXISTS `dedup` AFTER INSERT ON toots FOR EACH ROW BEGIN DELETE FROM toots WHERE rowid NOT IN (SELECT MIN(sortid) FROM toots GROUP BY uri ); END; ")
|
|
||||||
|
|
||||||
db.commit()
|
db.commit()
|
||||||
|
|
||||||
|
|
||||||
def handleCtrlC(signal, frame):
|
def handleCtrlC(signal, frame):
|
||||||
print("\nPREMATURE EVACUATION - Saving chunks")
|
print("\nPREMATURE EVACUATION - Saving chunks")
|
||||||
db.commit()
|
db.commit()
|
||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
||||||
|
|
||||||
signal.signal(signal.SIGINT, handleCtrlC)
|
signal.signal(signal.SIGINT, handleCtrlC)
|
||||||
|
|
||||||
patterns = {
|
patterns = {
|
||||||
|
@ -151,28 +139,29 @@ patterns = {
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
def insert_toot(post, acc, content, cursor): # extracted to prevent duplication
|
def insert_toot(oii, acc, post, cursor): # extracted to prevent duplication
|
||||||
|
pid = patterns["pid"].search(oii['object']['id']).group(0)
|
||||||
cursor.execute("REPLACE INTO toots (id, cw, userid, uri, content) VALUES (?, ?, ?, ?, ?)", (
|
cursor.execute("REPLACE INTO toots (id, cw, userid, uri, content) VALUES (?, ?, ?, ?, ?)", (
|
||||||
post['id'],
|
pid,
|
||||||
1 if (post['spoiler_text'] is not None and post['spoiler_text'] != "") else 0,
|
1 if (oii['object']['summary'] != None and oii['object']['summary'] != "") else 0,
|
||||||
acc.id,
|
acc.id,
|
||||||
post['uri'],
|
oii['object']['id'],
|
||||||
content
|
post
|
||||||
))
|
))
|
||||||
|
|
||||||
|
|
||||||
for f in following:
|
for f in following:
|
||||||
last_toot = c.execute("SELECT id FROM `toots` WHERE userid LIKE ? ORDER BY sortid DESC LIMIT 1", (f.id,)).fetchone()
|
last_toot = c.execute("SELECT id FROM `toots` WHERE userid LIKE ? ORDER BY sortid DESC LIMIT 1", (f.id,)).fetchone()
|
||||||
if last_toot is not None:
|
if last_toot != None:
|
||||||
last_toot = last_toot[0]
|
last_toot = last_toot[0]
|
||||||
else:
|
else:
|
||||||
last_toot = 0
|
last_toot = 0
|
||||||
print("Downloading posts for user @{}, starting from {}".format(f.acct, last_toot))
|
print("Downloading posts for user @{}, starting from {}".format(f.acct, last_toot))
|
||||||
|
|
||||||
# find the user's activitypub outbox
|
#find the user's activitypub outbox
|
||||||
print("WebFingering...")
|
print("WebFingering...")
|
||||||
instance = patterns["handle"].search(f.acct)
|
instance = patterns["handle"].search(f.acct)
|
||||||
if instance is None:
|
if instance == None:
|
||||||
instance = patterns["url"].search(cfg['site']).group(1)
|
instance = patterns["url"].search(cfg['site']).group(1)
|
||||||
else:
|
else:
|
||||||
instance = instance.group(1)
|
instance = instance.group(1)
|
||||||
|
@ -182,45 +171,87 @@ for f in following:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
try:
|
try:
|
||||||
# download first 20 toots since last toot
|
# 1. download host-meta to find webfinger URL
|
||||||
posts = client.account_statuses(f.id, min_id=last_toot)
|
r = requests.get("https://{}/.well-known/host-meta".format(instance), timeout=10)
|
||||||
|
# 2. use webfinger to find user's info page
|
||||||
|
uri = patterns["uri"].search(r.text).group(1)
|
||||||
|
uri = uri.format(uri = "{}@{}".format(f.username, instance))
|
||||||
|
r = requests.get(uri, headers={"Accept": "application/json"}, timeout=10)
|
||||||
|
j = r.json()
|
||||||
|
found = False
|
||||||
|
for link in j['links']:
|
||||||
|
if link['rel'] == 'self':
|
||||||
|
#this is a link formatted like "https://instan.ce/users/username", which is what we need
|
||||||
|
uri = link['href']
|
||||||
|
found = True
|
||||||
|
break
|
||||||
|
if not found:
|
||||||
|
print("Couldn't find a valid ActivityPub outbox URL.")
|
||||||
|
|
||||||
|
# 3. download first page of outbox
|
||||||
|
uri = "{}/outbox?page=true".format(uri)
|
||||||
|
r = requests.get(uri, timeout=15)
|
||||||
|
j = r.json()
|
||||||
except:
|
except:
|
||||||
print("oopsy woopsy!! we made a fucky wucky!!!\n(we're probably rate limited, please hang up and try again)")
|
print("oopsy woopsy!! we made a fucky wucky!!!\n(we're probably rate limited, please hang up and try again)")
|
||||||
sys.exit(1)
|
sys.exit(1)
|
||||||
|
|
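Steps 1 and 2 on the right-hand side (host-meta, then WebFinger) can be tried offline against canned responses. The sketch below is an illustration under several assumptions: the `patterns["uri"]` regex is not visible in this diff, so `TEMPLATE_RE` is a stand-in for it, the helper names are hypothetical, and the sample host-meta and WebFinger documents are hand-written rather than fetched from a real instance.

```python
import re

# Stand-in for patterns["uri"], which this diff does not show.
TEMPLATE_RE = re.compile(r'template="([^"]+)"')


def webfinger_url(host_meta_xml, username, instance):
    """Fill the host-meta template with the user@instance handle."""
    template = TEMPLATE_RE.search(host_meta_xml).group(1)
    return template.format(uri="{}@{}".format(username, instance))


def actor_url(webfinger_json):
    """Return the href of the 'self' link, or None if there is none."""
    for link in webfinger_json.get("links", []):
        if link.get("rel") == "self":
            return link["href"]
    return None


# Hand-written sample responses (assumptions, not real server output).
SAMPLE_HOST_META = (
    '<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">'
    '<Link rel="lrdd" template="https://example.social/.well-known/webfinger?resource={uri}"/>'
    "</XRD>"
)
SAMPLE_WEBFINGER = {
    "links": [
        {"rel": "http://webfinger.net/rel/profile-page", "href": "https://example.social/@alice"},
        {"rel": "self", "type": "application/activity+json", "href": "https://example.social/users/alice"},
    ]
}
```

Appending `/outbox?page=true` to the URL returned by `actor_url` gives the address requested in step 3. Returning `None` when no `self` link exists lets the caller skip the user instead of continuing with a stale `uri`.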
||||||
|
pleroma = False
|
||||||
|
if 'next' not in j and 'prev' not in j:
|
||||||
|
# there's only one page of results, don't bother doing anything special
|
||||||
|
pass
|
||||||
|
elif 'prev' not in j:
|
||||||
|
print("Using Pleroma compatibility mode")
|
||||||
|
pleroma = True
|
||||||
|
if 'first' in j:
|
||||||
|
# apparently there used to be a 'first' field in pleroma's outbox output, but it's not there any more
|
||||||
|
# i'll keep this for backwards compatibility with older pleroma instances
|
||||||
|
# it was removed in pleroma 1.0.7 - https://git.pleroma.social/pleroma/pleroma/-/blob/841e4e4d835b8d1cecb33102356ca045571ef1fc/CHANGELOG.md#107-2019-09-26
|
||||||
|
j = j['first']
|
||||||
|
else:
|
||||||
|
print("Using standard mode")
|
||||||
|
uri = "{}&min_id={}".format(uri, last_toot)
|
||||||
|
r = requests.get(uri)
|
||||||
|
j = r.json()
|
||||||
|
|
||||||
print("Downloading and saving posts", end='', flush=True)
|
print("Downloading and saving posts", end='', flush=True)
|
||||||
done = False
|
done = False
|
||||||
try:
|
try:
|
||||||
while not done and len(posts) > 0:
|
while not done and len(j['orderedItems']) > 0:
|
||||||
for post in posts:
|
for oi in j['orderedItems']:
|
||||||
if post['reblog'] is not None:
|
if oi['type'] != "Create":
|
||||||
continue # this isn't a toot/post/status/whatever, it's a boost or a follow or some other activitypub thing. ignore
|
continue #this isn't a toot/post/status/whatever, it's a boost or a follow or some other activitypub thing. ignore
|
||||||
|
|
||||||
# its a toost baby
|
# its a toost baby
|
||||||
content = post['content']
|
content = oi['object']['content']
|
||||||
toot = extract_toot(content)
|
toot = extract_toot(content)
|
||||||
# print(toot)
|
# print(toot)
|
||||||
try:
|
try:
|
||||||
if c.execute("SELECT COUNT(*) FROM toots WHERE uri LIKE ?", (post['id'],)).fetchone()[0] > 0:
|
if pleroma:
|
||||||
# we've caught up to the notices we've already downloaded, so we can stop now
|
if c.execute("SELECT COUNT(*) FROM toots WHERE uri LIKE ?", (oi['object']['id'],)).fetchone()[0] > 0:
|
||||||
# you might be wondering, "lynne, what if the instance ratelimits you after 40 posts, and they've made 60 since main.py was last run? wouldn't the bot miss 20 posts and never be able to see them?" to which i reply, "i know but i don't know how to fix it"
|
#we've caught up to the notices we've already downloaded, so we can stop now
|
||||||
done = True
|
#you might be wondering, "lynne, what if the instance ratelimits you after 40 posts, and they've made 60 since main.py was last run? wouldn't the bot miss 20 posts and never be able to see them?" to which i reply, "i know but i don't know how to fix it"
|
||||||
|
done = True
|
||||||
|
continue
|
||||||
if 'lang' in cfg:
|
if 'lang' in cfg:
|
||||||
try:
|
try:
|
||||||
if post['language'] == cfg['lang']: # filter for language
|
if oi['object']['contentMap'][cfg['lang']]: # filter for language
|
||||||
insert_toot(post, f, toot, c)
|
insert_toot(oi, f, toot, c)
|
||||||
except KeyError:
|
except KeyError:
|
||||||
# JSON doesn't have language, just insert the toot irregardlessly
|
#JSON doesn't have contentMap, just insert the toot irregardlessly
|
||||||
insert_toot(post, f, toot, c)
|
insert_toot(oi, f, toot, c)
|
||||||
else:
|
else:
|
||||||
insert_toot(post, f, toot, c)
|
insert_toot(oi, f, toot, c)
|
||||||
pass
|
pass
|
||||||
except:
|
except:
|
||||||
pass # ignore any toots that don't successfully go into the DB
|
pass #ignore any toots that don't successfully go into the DB
|
||||||
|
|
||||||
# get the next <20 posts
|
# get the next/previous page
|
||||||
try:
|
try:
|
||||||
posts = client.account_statuses(f.id, min_id=posts[0]['id'])
|
if not pleroma:
|
||||||
|
r = requests.get(j['prev'], timeout=15)
|
||||||
|
else:
|
||||||
|
r = requests.get(j['next'], timeout=15)
|
||||||
except requests.Timeout:
|
except requests.Timeout:
|
||||||
print("HTTP timeout, site did not respond within 15 seconds")
|
print("HTTP timeout, site did not respond within 15 seconds")
|
||||||
except KeyError:
|
except KeyError:
|
||||||
|
@ -228,6 +259,7 @@ for f in following:
|
||||||
except:
|
except:
|
||||||
print("An error occurred while trying to obtain more posts.")
|
print("An error occurred while trying to obtain more posts.")
|
||||||
|
|
||||||
|
j = r.json()
|
||||||
print('.', end='', flush=True)
|
print('.', end='', flush=True)
|
||||||
print(" Done!")
|
print(" Done!")
|
||||||
db.commit()
|
db.commit()
|
||||||
|
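The account-discovery steps in the right-hand side of this hunk (host-meta, then WebFinger, then the `rel="self"` link) can be sketched as two pure functions. This is a minimal illustration, not the code from `main.py`: the diff uses `patterns["uri"]`, whose definition is not shown here, so the `LRDD_TEMPLATE` regex and the names `webfinger_url` and `find_actor_uri` are assumptions. The HTTP requests are left out so that only the parsing is shown.

```python
import re

# Hypothetical stand-in for patterns["uri"], which is not visible in this diff.
# It captures the WebFinger URL template advertised in a host-meta document.
LRDD_TEMPLATE = re.compile(r'template="([^"]+)"')


def webfinger_url(host_meta_xml, username, instance):
    """Fill the host-meta template with the account's user@instance handle."""
    match = LRDD_TEMPLATE.search(host_meta_xml)
    if match is None:
        return None
    # The template contains a literal "{uri}" placeholder, as in the diff.
    return match.group(1).format(uri="{}@{}".format(username, instance))


def find_actor_uri(webfinger_doc):
    """Return the href of the rel="self" link, or None when there is none."""
    for link in webfinger_doc.get("links", []):
        if link.get("rel") == "self":
            return link.get("href")
    return None
```

Returning `None` instead of printing a message lets the caller decide whether to skip the account or abort when no usable link is found.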
@ -246,6 +278,6 @@ for f in following:
|
||||||
print("Done!")
|
print("Done!")
|
||||||
|
|
||||||
db.commit()
|
db.commit()
|
||||||
db.execute("VACUUM") # compact db
|
db.execute("VACUUM") #compact db
|
||||||
db.commit()
|
db.commit()
|
||||||
db.close()
|
db.close()
|
||||||
|
|
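The outbox paging logic on the right-hand side of the `main.py` diff picks a mode from the first page and then follows `prev` in standard mode or `next` in Pleroma compatibility mode. A minimal sketch of that decision, assuming the page has already been parsed into a dict; the function names `detect_mode` and `following_page_url` are hypothetical and do not appear in the source.

```python
def detect_mode(first_page):
    """Classify an outbox page the same way the diff's if/elif/else does."""
    if "next" not in first_page and "prev" not in first_page:
        # Only one page of results, nothing further to fetch.
        return "single"
    if "prev" not in first_page:
        return "pleroma"
    return "standard"


def following_page_url(page, mode):
    """Return the URL of the page to fetch after this one, or None."""
    # Standard mode walks 'prev' links; Pleroma compatibility mode walks 'next'.
    key = "next" if mode == "pleroma" else "prev"
    return page.get(key)
```

Using `page.get(key)` returns `None` when the link is missing, so a caller can stop the loop without relying on a `KeyError`.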
53
reply.py
|
@ -4,12 +4,12 @@
|
||||||
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
|
# file, You can obtain one at http://mozilla.org/MPL/2.0/.
|
||||||
|
|
||||||
import mastodon
|
import mastodon
|
||||||
import re, json, argparse
|
import random, re, json, argparse
|
||||||
import functions
|
import functions
|
||||||
|
from bs4 import BeautifulSoup
|
||||||
|
|
||||||
parser = argparse.ArgumentParser(description='Reply service. Leave running in the background.')
|
parser = argparse.ArgumentParser(description='Reply service. Leave running in the background.')
|
||||||
parser.add_argument(
|
parser.add_argument('-c', '--cfg', dest='cfg', default='config.json', nargs='?',
|
||||||
'-c', '--cfg', dest='cfg', default='config.json', nargs='?',
|
|
||||||
help="Specify a custom location for config.json.")
|
help="Specify a custom location for config.json.")
|
||||||
|
|
||||||
args = parser.parse_args()
|
args = parser.parse_args()
|
||||||
|
@ -17,23 +17,21 @@ args = parser.parse_args()
|
||||||
cfg = json.load(open(args.cfg, 'r'))
|
cfg = json.load(open(args.cfg, 'r'))
|
||||||
|
|
||||||
client = mastodon.Mastodon(
|
client = mastodon.Mastodon(
|
||||||
client_id=cfg['client']['id'],
|
client_id=cfg['client']['id'],
|
||||||
client_secret=cfg['client']['secret'],
|
client_secret=cfg['client']['secret'],
|
||||||
access_token=cfg['secret'],
|
access_token=cfg['secret'],
|
||||||
api_base_url=cfg['site'])
|
api_base_url=cfg['site'])
|
||||||
|
|
||||||
|
|
||||||
def extract_toot(toot):
|
def extract_toot(toot):
|
||||||
text = functions.extract_toot(toot)
|
text = functions.extract_toot(toot)
|
||||||
text = re.sub(r"^@[^@]+@[^ ]+\s*", r"", text) # remove the initial mention
|
text = re.sub(r"^@[^@]+@[^ ]+\s*", r"", text) #remove the initial mention
|
||||||
text = text.lower() # treat text as lowercase for easier keyword matching (if this bot uses it)
|
text = text.lower() #treat text as lowercase for easier keyword matching (if this bot uses it)
|
||||||
return text
|
return text
|
||||||
|
|
||||||
|
|
||||||
class ReplyListener(mastodon.StreamListener):
|
class ReplyListener(mastodon.StreamListener):
|
||||||
def on_notification(self, notification): # listen for notifications
|
def on_notification(self, notification): #listen for notifications
|
||||||
if notification['type'] == 'mention': # if we're mentioned:
|
if notification['type'] == 'mention': #if we're mentioned:
|
||||||
acct = "@" + notification['account']['acct'] # get the account's @
|
acct = "@" + notification['account']['acct'] #get the account's @
|
||||||
post_id = notification['status']['id']
|
post_id = notification['status']['id']
|
||||||
|
|
||||||
# check if we've already been participating in this thread
|
# check if we've already been participating in this thread
|
||||||
|
@ -46,7 +44,7 @@ class ReplyListener(mastodon.StreamListener):
|
||||||
posts = 0
|
posts = 0
|
||||||
for post in context['ancestors']:
|
for post in context['ancestors']:
|
||||||
if post['account']['id'] == me:
|
if post['account']['id'] == me:
|
||||||
pin = post["id"] # Only used if pin is called, but easier to call here
|
pin = post["id"] #Only used if pin is called, but easier to call here
|
||||||
posts += 1
|
posts += 1
|
||||||
if posts >= cfg['max_thread_length']:
|
if posts >= cfg['max_thread_length']:
|
||||||
# stop replying
|
# stop replying
|
||||||
|
@ -54,12 +52,12 @@ class ReplyListener(mastodon.StreamListener):
|
||||||
return
|
return
|
||||||
|
|
||||||
mention = extract_toot(notification['status']['content'])
|
mention = extract_toot(notification['status']['content'])
|
||||||
if (mention == "pin") or (mention == "unpin"): # check for keywords
|
if (mention == "pin") or (mention == "unpin"): #check for keywords
|
||||||
print("Found pin/unpin")
|
print("Found pin/unpin")
|
||||||
# get a list of people the bot is following
|
#get a list of people the bot is following
|
||||||
validusers = client.account_following(me)
|
validusers = client.account_following(me)
|
||||||
for user in validusers:
|
for user in validusers:
|
||||||
if user["id"] == notification["account"]["id"]: # user is #valid
|
if user["id"] == notification["account"]["id"]: #user is #valid
|
||||||
print("User is valid")
|
print("User is valid")
|
||||||
visibility = notification['status']['visibility']
|
visibility = notification['status']['visibility']
|
||||||
if visibility == "public":
|
if visibility == "public":
|
||||||
|
@ -67,25 +65,22 @@ class ReplyListener(mastodon.StreamListener):
|
||||||
if mention == "pin":
|
if mention == "pin":
|
||||||
print("pin received, pinning")
|
print("pin received, pinning")
|
||||||
client.status_pin(pin)
|
client.status_pin(pin)
|
||||||
client.status_post("Toot pinned!", post_id, visibility=visibility, spoiler_text=cfg['cw'])
|
client.status_post("Toot pinned!", post_id, visibility=visibility, spoiler_text = cfg['cw'])
|
||||||
else:
|
else:
|
||||||
print("unpin received, unpinning")
|
print("unpin received, unpinning")
|
||||||
client.status_post("Toot unpinned!", post_id, visibility=visibility, spoiler_text=cfg['cw'])
|
client.status_post("Toot unpinned!", post_id, visibility=visibility, spoiler_text = cfg['cw'])
|
||||||
client.status_unpin(pin)
|
client.status_unpin(pin)
|
||||||
else:
|
else:
|
||||||
print("User is not valid")
|
print("User is not valid")
|
||||||
else:
|
else:
|
||||||
toot = functions.make_toot(cfg) # generate a toot
|
toot = functions.make_toot(cfg) #generate a toot
|
||||||
if toot == "": # Regenerate the post if it contains a blacklisted word
|
toot = acct + " " + toot #prepend the @
|
||||||
toot = functions.make_toot(cfg)
|
print(acct + " says " + mention) #logging
|
||||||
toot = acct + " " + toot # prepend the @
|
|
||||||
print(acct + " says " + mention) # logging
|
|
||||||
visibility = notification['status']['visibility']
|
visibility = notification['status']['visibility']
|
||||||
if visibility == "public":
|
if visibility == "public":
|
||||||
visibility = "unlisted"
|
visibility = "unlisted"
|
||||||
client.status_post(toot, post_id, visibility=visibility, spoiler_text=cfg['cw'] if cfg['cw_reply'] else None) # send toost
|
client.status_post(toot, post_id, visibility=visibility, spoiler_text = cfg['cw']) #send toost
|
||||||
print("replied with " + toot) # logging
|
print("replied with " + toot) #logging
|
||||||
|
|
||||||
|
|
||||||
rl = ReplyListener()
|
rl = ReplyListener()
|
||||||
client.stream_user(rl) # go!
|
client.stream_user(rl) #go!
|
||||||
|
|
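Two small pieces of `reply.py` can be shown in isolation: stripping the leading mention before keyword matching, and downgrading public replies to unlisted. The sketch below reuses the regular expression from the diff but skips the `functions.extract_toot` call that precedes it in the source, so it assumes the input is already plain text. The names `clean_mention` and `reply_visibility` are hypothetical.

```python
import re


def clean_mention(text):
    """Drop a leading @user@instance mention and lowercase the remainder."""
    text = re.sub(r"^@[^@]+@[^ ]+\s*", "", text)
    return text.lower()


def reply_visibility(visibility):
    """Reply as unlisted to public posts; keep any other visibility as is."""
    return "unlisted" if visibility == "public" else visibility
```

For example, `clean_mention("@bot@instan.ce Pin")` yields `"pin"`, which is the keyword the listener compares against.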
|
@ -1,4 +1,4 @@
|
||||||
Mastodon.py==1.5.1
|
Mastodon.py==1.5.1
|
||||||
markovify==0.8.2
|
markovify==0.8.2
|
||||||
beautifulsoup4==4.9.1
|
beautifulsoup4==4.9.3
|
||||||
requests==2.24.0
|
requests==2.24.0
|
||||||
|
|