Commit be2e1c7

feat: v0.2 — port XActions scraper patterns, ship real commands
v0.1 was scaffolding + transport. v0.2 makes the commands real by porting
the production-grade subset of XActions' HTTP scraper
(`reference/XActions/src/scrapers/twitter/http/*`) into a typed Go domain
layer with the same stable rules:

api/extract.go
  Shared map-walking helpers (getMap, getString, getInt with string→int
  for X's view counts, getBool, getSlice, walkPath, walkPathSlice,
  walkPathMap, copyMap). Single source of truth for the parser layer.

api/tweets.go: ParseTweet projection
  - Recognizes Tweet | TweetWithVisibilityResults (unwrap one level) |
    TweetTombstone (typename → non-nil marker).
  - Recursive quote and retweet projection with a hard maxQuoteDepth=5
    cap so a malicious or pathological response cannot drive the parser
    into stack exhaustion.
  - Picks the highest-bitrate video/mp4 variant for video URLs.
  - parseTimelineInstructions dispatches on TimelineAddEntries /
    TimelineAddToModule / TimelinePinEntry, with cursor-bottom- /
    cursor-top- / tweet- / user- entry-ID prefix typing.
  - GetThread reconstructs a conversation from TweetDetail, filters to
    the focal tweet's author by default (self-thread), sorts
    chronologically.
  - resolveUserID is cached per Client via sync.Map so paginated reads
    and the monitor poll loop only pay the UserByScreenName roundtrip
    once per session.

api/relationships.go: Followers / Following / Likers / Retweeters
  - ParseUserSummary projects user_results.result, drops
    UserUnavailable, bumps avatar from _normal to _400x400.
  - Generic scrapeUserList paginator with username-keyed Map dedup,
    progress callback, multi-page cursor advance.
  - Endpoint-specific instructions paths:
      Followers/Following → data.user.result.timeline.timeline.instructions
      Retweeters          → data.retweeters_timeline.timeline.instructions
      Favoriters          → data.favoriters_timeline.timeline.instructions

api/search.go: SearchPosts / SearchUsers
  - Wraps SearchTimeline with the product flag
    (Latest|Top|People|Photos|Videos).
  - BuildAdvancedQuery composes from:, to:, since:, until:, min_faves:,
    min_retweets:, min_replies:, lang:, filter:, -filter: into the raw
    query string.
  - Dedicated parseSearchUserInstructions for product=People shape.

api/actions.go: Follow / Unfollow + classifyMutationErrors
  - REST form POST to friendshipsCreate / friendshipsDestroy with
    user_id + include_profile_interstitial_type=1 + skip_status=true,
    matching what real browsers send.
  - classifyMutationErrors inspects the response `errors[]` envelope
    and treats idempotent failures ("already followed", etc.) as
    success, so a re-run after a partial batch does not log false
    failures. Maps "rate limit" / "spam protection" to RateLimitError;
    "cannot find specified user" and a suspended target user to
    NotFoundError; suspension of the acting account to APIError.

api/media.go: DownloadTweetMedia
  - Streams every media asset on a tweet to disk via the same
    http.Client (so any future TLS impersonation round-tripper
    applies).
  - Atomic write via temp + rename so a crash mid-download cannot
    corrupt a previously valid file.
  - applyImageQuality appends the X CDN size hint (?name=large by
    default; small|medium|large|orig).

cmd/*: real commands replacing v0.1 stubs (cmd/stubs.go deleted)
  cmd/tweets.go         tweets list <user> [-n --replies], tweets get <id>
  cmd/relationships.go  followers, following (shared paginated runner)
  cmd/search.go         search posts (--product --since --until --from
                        --to --lang --filter --exclude --min-likes
                        --min-retweets), search users
  cmd/thread.go         thread unroll [--all-authors]
  cmd/media.go          media download <id|url> [-o --quality]
  cmd/monitor.go        monitor account [-i --once] poll loop with
                        new-tweet streaming and follower delta; interval
                        is clamped to a minimum of 15s
  cmd/grow.go           grow follow-engagers <tweet>,
                        grow follow-by-keyword <query>; dry-run by
                        default, --apply required to mutate; refuses to
                        mutate from a cloud ASN unless
                        --i-know-its-a-cloud-ip is also passed (now wired
                        to cmd/doctor.go's EgressIsCloud; was a stub in
                        the first draft); --max --min-followers;
                        mergeUserCandidates dedups, sorts into
                        log-buckets by follower count, then shuffles
                        within each bucket so the action pattern is not
                        deterministically biased toward the same whales
                        every run.
  cmd/doctor.go         Exports EgressIsCloud(ctx) for grow's mutation
                        gate.

internal/cmdutil/render.go
  PrintJSON, TabPrinter, HumanCount (1.2k / 3.4M), RelTime,
  TruncateRunes (rune-safe so multi-byte text doesn't get corrupted),
  SingleLine.

Tests
  api/tweets_test.go         rich UserTweets fixture covering Pinned +
                             wrapped + tombstone + quote + video bitrate;
                             full UserTweets pipeline against a fake
                             server with two pages; ParseTweet depth-cap
                             regression.
  api/relationships_test.go  full Followers pipeline with cross-page
                             dedup + UserUnavailable filter.
  api/search_test.go         BuildAdvancedQuery cases,
                             parseSearchUserInstructions.
  api/actions_test.go        FollowUser form body shape,
                             classifyMutationErrors dispatch (idempotent /
                             rate-limit / not-found / unknown).
  api/media_test.go          image+video download against a fake server,
                             applyImageQuality, extFromURL.
  api/profile_test.go        getInt now accepts string-encoded ints.

Coverage: api/ 70.8%, internal/store/ 62.8%.
Race-clean under go test -race -count=1 ./...

Reviewed by an internal Linus-style pass; the four mandatory fixes
(asnIsCloud wired to real doctor egress check, ParseTweet recursion depth
cap, dead sentinel + joinArgs reimpl removed, resolveUserID cached per
Client) are all in this commit.

Docs
  skills/x-cli/SKILL.md        updated to match real flag names
  docs/comparison-xactions.md  §6 marked features Implemented with file
                               references; new bullet for media chunked
                               uploader = Drop (out of scope for v0.1).
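
The ParseTweet recursion depth cap listed among the mandatory fixes can be pictured with the sketch below. It is an assumption-laden illustration: the `Tweet` type, the `"quoted"` key, and the choice to count depth from 0 at the root are all hypothetical, and only the constant name and value (maxQuoteDepth = 5) come from the commit message.

```go
package main

import "fmt"

const maxQuoteDepth = 5 // value stated in the commit message

// Tweet is a hypothetical, heavily reduced projection.
type Tweet struct {
	ID     string
	Quoted *Tweet
}

// parseTweet projects raw into a Tweet and follows "quoted" links at most
// maxQuoteDepth levels below the root.
func parseTweet(raw map[string]any, depth int) *Tweet {
	if raw == nil {
		return nil
	}
	id, _ := raw["id"].(string)
	t := &Tweet{ID: id}
	if depth < maxQuoteDepth {
		if q, ok := raw["quoted"].(map[string]any); ok {
			t.Quoted = parseTweet(q, depth+1)
		}
	}
	return t
}

// chainLen counts the tweets linked through Quoted, root included.
func chainLen(t *Tweet) int {
	n := 0
	for ; t != nil; t = t.Quoted {
		n++
	}
	return n
}

func main() {
	loop := map[string]any{"id": "1"}
	loop["quoted"] = loop // pathological input: a tweet that quotes itself
	// Without the cap this would recurse until the stack is exhausted.
	fmt.Println(chainLen(parseTweet(loop, 0))) // 6: the root plus 5 quoted levels
}
```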
1 parent efada90 commit be2e1c7

27 files changed

Lines changed: 3811 additions & 114 deletions

Makefile

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 BINARY := x
-PKG := github.com/lroolle/x-cli
+PKG := github.com/thevibeworks/x-cli
 VERSION := $(shell git describe --tags --always --dirty 2>/dev/null || echo dev)
 LDFLAGS := -s -w -X $(PKG)/internal/version.Version=$(VERSION)
 GOFLAGS := -trimpath

api/actions.go

Lines changed: 133 additions & 0 deletions
@@ -0,0 +1,133 @@
+package api
+
+// Mutating actions: follow / unfollow. Mirrors the GraphQL/REST mutation
+// helpers in XActions' src/scrapers/twitter/http/engagement.js, with one
+// important addition: x-cli inspects the response body for X's "errors"
+// envelope and treats idempotent failures (already following, etc.) as
+// success. Without that, a retry after a partial success looks like a
+// failure and the caller can't tell the difference.
+
+import (
+	"context"
+	"encoding/json"
+	"fmt"
+	"net/url"
+	"strings"
+)
+
+// FollowUser follows a user by their numeric `rest_id`. Routes through
+// the REST `friendshipsCreate` endpoint, which is what real browsers use
+// for follow actions. Throttle-aware via Client.REST.
+func (c *Client) FollowUser(ctx context.Context, userID string) error {
+	if userID == "" {
+		return &APIError{Endpoint: "friendshipsCreate", Status: 0, Body: "empty user id"}
+	}
+	form := url.Values{}
+	form.Set("user_id", userID)
+	form.Set("include_profile_interstitial_type", "1")
+	form.Set("skip_status", "true")
+	return c.restMutationCheckErrors(ctx, "friendshipsCreate", form)
+}
+
+// UnfollowUser unfollows a user by their numeric `rest_id`.
+func (c *Client) UnfollowUser(ctx context.Context, userID string) error {
+	if userID == "" {
+		return &APIError{Endpoint: "friendshipsDestroy", Status: 0, Body: "empty user id"}
+	}
+	form := url.Values{}
+	form.Set("user_id", userID)
+	form.Set("include_profile_interstitial_type", "1")
+	form.Set("skip_status", "true")
+	return c.restMutationCheckErrors(ctx, "friendshipsDestroy", form)
+}
+
+// FollowByUsername resolves a screen name to a user ID and follows them.
+func (c *Client) FollowByUsername(ctx context.Context, screenName string) error {
+	uid, err := c.resolveUserID(ctx, strings.TrimPrefix(screenName, "@"))
+	if err != nil {
+		return err
+	}
+	return c.FollowUser(ctx, uid)
+}
+
+// restMutationCheckErrors wraps Client.REST and inspects the decoded body
+// for X's `errors[]` envelope. The envelope dispatch is:
+//
+//   - "already following" / "you have already" / variants → silent success
+//   - "rate limit" / "to protect our users from spam" → RateLimitError
+//   - "cannot find specified user" / "user not found" /
+//     "user has been suspended" (the target) → NotFoundError
+//   - "your account is suspended" / "this account is suspended" → APIError (final)
+//   - anything else → APIError
+//
+// Throttle accounting and exponential backoff are handled inside Client.REST.
+// This wrapper only handles the message-level dispatch.
+func (c *Client) restMutationCheckErrors(ctx context.Context, endpointName string, form url.Values) error {
+	var body map[string]any
+	if err := c.REST(ctx, endpointName, form, &body); err != nil {
+		return err
+	}
+	return classifyMutationErrors(endpointName, body)
+}
+
+// classifyMutationErrors dispatches the `errors[]` envelope. It is
+// unexported but factored out as a standalone helper for use by tests and
+// future GraphQL mutation paths.
+func classifyMutationErrors(endpointName string, body map[string]any) error {
+	errs := getSlice(body, "errors")
+	if len(errs) == 0 {
+		return nil
+	}
+	for _, e := range errs {
+		em, ok := e.(map[string]any)
+		if !ok {
+			continue
+		}
+		msg := strings.ToLower(getString(em, "message"))
+
+		// Idempotent: already in the desired state.
+		if strings.Contains(msg, "already favorited") ||
+			strings.Contains(msg, "already retweeted") ||
+			strings.Contains(msg, "already bookmarked") ||
+			strings.Contains(msg, "you have already") ||
+			strings.Contains(msg, "already followed") ||
+			strings.Contains(msg, "already following") ||
+			strings.Contains(msg, "not found in list of retweets") {
+			return nil
+		}
+
+		// Rate limit (server-side message variant; Throttle.Observe handles
+		// HTTP 429 separately).
+		if strings.Contains(msg, "rate limit") ||
+			strings.Contains(msg, "to protect our users from spam") ||
+			strings.Contains(msg, "too many requests") {
+			return &RateLimitError{Endpoint: endpointName}
+		}
+
+		// Not found (includes a suspended target user).
+		if strings.Contains(msg, "cannot find specified user") ||
+			strings.Contains(msg, "user not found") ||
+			strings.Contains(msg, "user has been suspended") {
+			return &NotFoundError{Endpoint: endpointName}
+		}
+
+		// Suspended (the actor, i.e. our own session).
+		if strings.Contains(msg, "your account is suspended") ||
+			strings.Contains(msg, "this account is suspended") {
+			return &APIError{Endpoint: endpointName, Status: 0, Body: getString(em, "message")}
+		}
+	}
+
+	// Unrecognized error envelope: surface the raw messages.
+	parts := make([]string, 0, len(errs))
+	for _, e := range errs {
+		if em, ok := e.(map[string]any); ok {
+			if msg := getString(em, "message"); msg != "" {
+				parts = append(parts, msg)
+			}
+		}
+	}
+	if len(parts) == 0 {
+		raw, _ := json.Marshal(body)
+		return &APIError{Endpoint: endpointName, Status: 0, Body: string(raw)}
+	}
+	return &APIError{Endpoint: endpointName, Status: 0, Body: fmt.Sprintf("graphql errors: %s", strings.Join(parts, "; "))}
+}

api/actions_test.go

Lines changed: 163 additions & 0 deletions
@@ -0,0 +1,163 @@
+package api
+
+import (
+	"context"
+	"errors"
+	"io"
+	"net/http"
+	"net/http/httptest"
+	"strings"
+	"testing"
+)
+
+func TestFollowUserSendsCorrectFormBody(t *testing.T) {
+	var captured map[string]string
+	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+		body, _ := io.ReadAll(r.Body)
+		captured = map[string]string{
+			"path":   r.URL.Path,
+			"method": r.Method,
+			"body":   string(body),
+			"ctype":  r.Header.Get("Content-Type"),
+		}
+		w.Header().Set("Content-Type", "application/json")
+		w.Write([]byte(`{"id": 1, "name": "test"}`))
+	}))
+	defer srv.Close()
+
+	eps := &EndpointMap{
+		Bases:  Bases{REST: srv.URL, GraphQL: srv.URL},
+		Bearer: "B",
+		REST: map[string]RESTEndpoint{
+			"friendshipsCreate": {
+				Path:     "/1.1/friendships/create.json",
+				Method:   "POST",
+				Kind:     "mutation",
+				MinGap:   10 * 1000 * 1000, // 10ms
+				MaxGap:   10 * 1000 * 1000,
+				DailyCap: 10,
+			},
+		},
+	}
+	c := New(Options{
+		Endpoints: eps,
+		Throttle:  NewThrottle(Defaults{}),
+		Session:   Session{Cookies: map[string]string{"auth_token": "x", "ct0": "y"}},
+	})
+
+	if err := c.FollowUser(context.Background(), "12345"); err != nil {
+		t.Fatal(err)
+	}
+
+	if captured["path"] != "/1.1/friendships/create.json" {
+		t.Errorf("path = %q", captured["path"])
+	}
+	if captured["method"] != "POST" {
+		t.Errorf("method = %q", captured["method"])
+	}
+	if !strings.HasPrefix(captured["ctype"], "application/x-www-form-urlencoded") {
+		t.Errorf("content-type = %q", captured["ctype"])
+	}
+	for _, want := range []string{"user_id=12345", "skip_status=true", "include_profile_interstitial_type=1"} {
+		if !strings.Contains(captured["body"], want) {
+			t.Errorf("body %q missing %q", captured["body"], want)
+		}
+	}
+}
+
+func TestFollowUserEmptyIDReturnsError(t *testing.T) {
+	c := &Client{}
+	if err := c.FollowUser(context.Background(), ""); err == nil {
+		t.Error("expected error for empty user id")
+	}
+}
+
+func TestClassifyMutationErrors(t *testing.T) {
+	cases := []struct {
+		name    string
+		body    map[string]any
+		wantErr error
+	}{
+		{
+			name:    "no errors",
+			body:    map[string]any{},
+			wantErr: nil,
+		},
+		{
+			name: "already following",
+			body: map[string]any{
+				"errors": []any{
+					map[string]any{"message": "You have already followed this user."},
+				},
+			},
+			wantErr: nil, // idempotent
+		},
+		{
+			name: "rate limited",
+			body: map[string]any{
+				"errors": []any{
+					map[string]any{"message": "Rate limit exceeded"},
+				},
+			},
+			wantErr: &RateLimitError{},
+		},
+		{
+			name: "user not found",
+			body: map[string]any{
+				"errors": []any{
+					map[string]any{"message": "Cannot find specified user"},
+				},
+			},
+			wantErr: &NotFoundError{},
+		},
+		{
+			name: "spam protection",
+			body: map[string]any{
+				"errors": []any{
+					map[string]any{"message": "To protect our users from spam"},
+				},
+			},
+			wantErr: &RateLimitError{},
+		},
+		{
+			name: "unknown error",
+			body: map[string]any{
+				"errors": []any{
+					map[string]any{"message": "Some unrecognised error"},
+				},
+			},
+			wantErr: &APIError{},
+		},
+	}
+	for _, tc := range cases {
+		t.Run(tc.name, func(t *testing.T) {
+			err := classifyMutationErrors("test", tc.body)
+			if tc.wantErr == nil {
+				if err != nil {
+					t.Errorf("want nil, got %T: %v", err, err)
+				}
+				return
+			}
+			if err == nil {
+				t.Fatalf("want %T, got nil", tc.wantErr)
+			}
+			switch tc.wantErr.(type) {
+			case *RateLimitError:
+				var target *RateLimitError
+				if !errors.As(err, &target) {
+					t.Errorf("want *RateLimitError, got %T: %v", err, err)
+				}
+			case *NotFoundError:
+				var target *NotFoundError
+				if !errors.As(err, &target) {
+					t.Errorf("want *NotFoundError, got %T: %v", err, err)
+				}
+			case *APIError:
+				var target *APIError
+				if !errors.As(err, &target) {
+					t.Errorf("want *APIError, got %T: %v", err, err)
+				}
+			}
+		})
+	}
+}

api/client.go

Lines changed: 2 additions & 0 deletions
@@ -35,6 +35,8 @@ type Client struct {
 	sessionMu sync.RWMutex
 	session   Session
 
+	userIDCache sync.Map // screen_name(lowercased) → rest_id
+
 	userAgent    string
 	retryBackoff time.Duration // base unit for exponential backoff; overridable in tests
 }
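
How the new userIDCache field might be used by resolveUserID is sketched below. The real resolveUserID is not part of this diff view, so everything here is an assumption for illustration: the `client` type, the `lookupRemote` stand-in for the UserByScreenName round trip, and the choice not to cache failures.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
	"sync/atomic"
)

// client is a reduced, hypothetical stand-in for api.Client.
type client struct {
	userIDCache sync.Map     // screen_name(lowercased) → rest_id
	lookups     atomic.Int64 // counts simulated network round trips
}

// lookupRemote stands in for the UserByScreenName request.
func (c *client) lookupRemote(name string) (string, error) {
	c.lookups.Add(1)
	return "id-" + name, nil
}

// resolveUserID returns the cached rest_id when present and otherwise pays
// one round trip and stores the result.
func (c *client) resolveUserID(screenName string) (string, error) {
	key := strings.ToLower(strings.TrimPrefix(screenName, "@"))
	if v, ok := c.userIDCache.Load(key); ok {
		return v.(string), nil
	}
	id, err := c.lookupRemote(key)
	if err != nil {
		return "", err // failures are not cached in this sketch
	}
	c.userIDCache.Store(key, id)
	return id, nil
}

func main() {
	c := &client{}
	for _, name := range []string{"@Jack", "jack", "JACK"} {
		id, _ := c.resolveUserID(name)
		fmt.Println(name, id)
	}
	fmt.Println("round trips:", c.lookups.Load())
}
```

Two goroutines that miss at the same moment may both perform the lookup in this sketch; for an idempotent read that is usually acceptable, and sync.Map keeps the stores themselves safe.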
