[BUG] MaxScale go.d collector: server_state chart never sets master/slave while REST API reports "Master, Running" / "Slave, Running" #1124

@mcmlln

Environment

Netdata: (please fill – e.g. netdata vX.Y.Z)

go.d.plugin version: v2.8.1
From debug output:

DBG godplugin/main.go:63 plugin: name=go.d, version=v2.8.1 ...

Collector: maxscale (go.d module)

MaxScale host: srva-pri-mxs-01

MaxScale REST API: http://127.0.0.1:8989

MaxScale CLI user: bneadm

MaxScale monitor: mxs-a-mdb-mon (MariaDB monitor)

Back-end servers (from MaxScale):

mxs-a-node1 – 10.7.100.43:3306

mxs-a-node2 – 10.7.100.44:3306

mxs-a-node3 – 10.7.100.143:3306

mxs-a-node4 – 10.7.100.144:3306

Summary of the issue

The maxscale go.d collector is successfully scraping MaxScale via the REST API and creating charts such as:

maxscale_local.poll_events

maxscale_local.current_sessions

maxscale_local.server_*_state

maxscale_local.server_*_current_connections

However, in the Server State charts (maxscale.server_state), only the running dimension is ever set to 1.

The master, slave, synced, etc. dimensions are always 0 for all servers, even though the MaxScale REST API’s attributes.state field clearly contains "Master, Running" for the primary and "Slave, Running" for replicas.

So Netdata correctly reports that servers are “Running”, but never reflects their Master/Slave roles.

Configuration (go.d maxscale job)

The collector is configured against the MaxScale REST API:

/etc/netdata/go.d/maxscale.conf (or equivalent)

jobs:
  - name: local
    url: http://127.0.0.1:8989
    username: bneadm
    password: "redacted"
    timeout: 6
    update_every: 1

Debug invocation:

sudo -u netdata /usr/libexec/netdata/plugins.d/go.d.plugin -d -m maxscale

This shows the job being created and passing its initial check:

CONFIG go.d:collector:maxscale:local create accepted job ... file=/usr/lib/netdata/conf.d/go.d/maxscale.conf ...
DBG maxscale/collector.go:81 using URL http://127.0.0.1:8989 collector=maxscale job=local
DBG maxscale/collector.go:82 using timeout: 6s collector=maxscale job=local
INF module/job.go:257 check success collector=maxscale job=local
CONFIG go.d:collector:maxscale:local status running
INF module/job.go:296 started, data collection interval 1s collector=maxscale job=local

MaxScale server roles (CLI)

From maxctrl:

maxctrl --user bneadm --password '' list servers

Output:

┌─────────────┬──────────────┬──────┬─────────────┬─────────────────┬────────────┬───────────────┐
│ Server      │ Address      │ Port │ Connections │ State           │ GTID       │ Monitor       │
├─────────────┼──────────────┼──────┼─────────────┼─────────────────┼────────────┼───────────────┤
│ mxs-a-node1 │ 10.7.100.43  │ 3306 │ 9           │ Master, Running │ 0-1-564777 │ mxs-a-mdb-mon │
│ mxs-a-node2 │ 10.7.100.44  │ 3306 │ 8           │ Slave, Running  │ 0-1-564777 │ mxs-a-mdb-mon │
│ mxs-a-node3 │ 10.7.100.143 │ 3306 │ 6           │ Slave, Running  │ 0-1-564777 │ mxs-a-mdb-mon │
│ mxs-a-node4 │ 10.7.100.144 │ 3306 │ 6           │ Slave, Running  │ 0-1-564777 │ mxs-a-mdb-mon │
└─────────────┴──────────────┴──────┴─────────────┴─────────────────┴────────────┴───────────────┘

So MaxScale clearly knows there is one Master and three Slaves.

MaxScale REST API (/v1/servers)

The go.d collector is configured to talk to the REST API on 127.0.0.1:8989 and that API returns the expected state values:

curl -s -u bneadm:'redacted' http://127.0.0.1:8989/v1/servers | jq '.data[].attributes.state'

Output:

"Master, Running"
"Slave, Running"
"Slave, Running"
"Slave, Running"

So the JSON contains the same state strings as shown by maxctrl.
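
For anyone reproducing this without jq, the same field can be read with a few lines of Go. This is only a sketch: the struct mirrors just the .data[].attributes.state path used in the jq filter above, the type and function names (serversResponse, serverStates) are made up for illustration, and the sample payload is a trimmed, hypothetical stand-in for a real /v1/servers response. It is not taken from the go.d module.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serversResponse mirrors only the part of the /v1/servers payload that the
// jq filter reads: .data[].attributes.state. All other fields are ignored.
type serversResponse struct {
	Data []struct {
		Attributes struct {
			State string `json:"state"`
		} `json:"attributes"`
	} `json:"data"`
}

// serverStates returns the attributes.state string of every entry in data,
// in the order the API returned them.
func serverStates(body []byte) ([]string, error) {
	var resp serversResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	states := make([]string, 0, len(resp.Data))
	for _, srv := range resp.Data {
		states = append(states, srv.Attributes.State)
	}
	return states, nil
}

func main() {
	// Hypothetical, trimmed payload; a real response carries many more fields.
	sample := []byte(`{"data":[
		{"attributes":{"state":"Master, Running"}},
		{"attributes":{"state":"Slave, Running"}}
	]}`)

	states, err := serverStates(sample)
	if err != nil {
		panic(err)
	}
	for _, s := range states {
		fmt.Println(s)
	}
}
```

If the collector decodes a different path than this one, the state string would come back empty, which would be consistent with the second hypothesis further down.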

Netdata output for maxscale.server_*_state

From the go.d debug output, here is an example snapshot of the maxscale_local.server_*_state metrics:

BEGIN 'maxscale_local.server_mxs-a-node1_state' 999445
SET 'master' = 0
SET 'slave' = 0
SET 'running' = 1
SET 'down' = 0
SET 'maintenance' = 0
SET 'draining' = 0
SET 'drained' = 0
SET 'relay_master' = 0
SET 'binlog_relay' = 0
SET 'synced' = 0
END

BEGIN 'maxscale_local.server_mxs-a-node2_state' 999445
SET 'master' = 0
SET 'slave' = 0
SET 'running' = 1
SET 'down' = 0
SET 'maintenance' = 0
SET 'draining' = 0
SET 'drained' = 0
SET 'relay_master' = 0
SET 'binlog_relay' = 0
SET 'synced' = 0
END

BEGIN 'maxscale_local.server_mxs-a-node3_state' 999445
SET 'master' = 0
SET 'slave' = 0
SET 'running' = 1
SET 'down' = 0
SET 'maintenance' = 0
SET 'draining' = 0
SET 'drained' = 0
SET 'relay_master' = 0
SET 'binlog_relay' = 0
SET 'synced' = 0
END

BEGIN 'maxscale_local.server_mxs-a-node4_state' 999445
SET 'master' = 0
SET 'slave' = 0
SET 'running' = 1
SET 'down' = 0
SET 'maintenance' = 0
SET 'draining' = 0
SET 'drained' = 0
SET 'relay_master' = 0
SET 'binlog_relay' = 0
SET 'synced' = 0
END

So for all four servers:

running = 1

master = 0

slave = 0

synced = 0

etc.

This persists over time; at no point do master or slave become 1, despite the REST API returning "Master, Running" / "Slave, Running".
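
To check this over a longer capture rather than by eye, the BEGIN/SET/END blocks can be parsed mechanically. The sketch below assumes the debug output has exactly the line shape shown above (BEGIN 'chart' N, SET 'dim' = V, END); parseCharts is a hypothetical helper written for this report, not part of go.d.plugin.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseCharts reads BEGIN/SET/END blocks and returns chart -> dimension -> value.
// Lines that do not match the expected shape are skipped.
func parseCharts(output string) map[string]map[string]int64 {
	charts := make(map[string]map[string]int64)
	current := ""
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		switch {
		case len(fields) >= 2 && fields[0] == "BEGIN":
			current = strings.Trim(fields[1], "'")
			charts[current] = make(map[string]int64)
		case len(fields) == 4 && fields[0] == "SET" && fields[2] == "=" && current != "":
			v, err := strconv.ParseInt(fields[3], 10, 64)
			if err != nil {
				continue // not an integer value, ignore the line
			}
			charts[current][strings.Trim(fields[1], "'")] = v
		case len(fields) == 1 && fields[0] == "END":
			current = ""
		}
	}
	return charts
}

func main() {
	// Shortened copy of the node1 block from the debug output.
	sample := `BEGIN 'maxscale_local.server_mxs-a-node1_state' 999445
SET 'master' = 0
SET 'slave' = 0
SET 'running' = 1
END
`
	for name, dims := range parseCharts(sample) {
		if !strings.HasSuffix(name, "_state") {
			continue
		}
		// Flag servers that are running but carry no master/slave role.
		if dims["running"] == 1 && dims["master"] == 0 && dims["slave"] == 0 {
			fmt.Printf("%s: running, but neither master nor slave is set\n", name)
		}
	}
}
```

Feeding a longer capture of the plugin output through the same function would show whether master or slave is ever set for any server.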

Other charts from the same collector (e.g. current_sessions, poll_events, current_file_descriptors) are being populated correctly.

Expected vs actual behaviour

Expected:

Given that:

/v1/servers returns "Master, Running" for one server and "Slave, Running" for the others, and

the maxscale module exposes per-server state dimensions master, slave, running, etc.

I would expect:

For the “master” server:

master = 1

running = 1

For the slave servers:

slave = 1

running = 1

so that the maxscale.server_state charts clearly distinguish the role of each backend.

Actual:

All servers always report:

running = 1

master = 0

slave = 0

There is no way, from Netdata’s charts, to tell which server is Master vs Slave, even though the REST API exposes this in the state string.

Suspicion / hypothesis

From the outside, this looks like a parsing issue or missing logic in the maxscale go.d module:

The module clearly reads the server list and populates other metrics correctly.

It does set running = 1, so it is at least parsing "Running".

It appears not to be setting master / slave based on the "Master, Running" / "Slave, Running" attributes.state strings — possibly because:

It’s not checking for "Master" / "Slave" substrings at all, or

It’s looking at a different field/path in the JSON schema than the one where state currently resides.
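
To make the expected behaviour concrete, here is a rough sketch of the kind of parsing I would expect. It is not the module's actual code, which I have not read. The dimension names are the ones from the debug output above. The mapping from a state token to a dimension (lowercase, spaces replaced by underscores) is my assumption, as are the names stateDims and parseState.

```go
package main

import (
	"fmt"
	"strings"
)

// stateDims lists the dimension names seen in the go.d debug output.
var stateDims = []string{
	"master", "slave", "running", "down", "maintenance",
	"draining", "drained", "relay_master", "binlog_relay", "synced",
}

// parseState splits a state string such as "Master, Running" on commas and
// sets each matching dimension to 1. Assumption: a token maps to a dimension
// by lowercasing it and replacing spaces with underscores. Unknown tokens
// are ignored.
func parseState(state string) map[string]int64 {
	dims := make(map[string]int64, len(stateDims))
	for _, d := range stateDims {
		dims[d] = 0
	}
	for _, tok := range strings.Split(state, ",") {
		name := strings.ToLower(strings.TrimSpace(tok))
		name = strings.ReplaceAll(name, " ", "_")
		if _, ok := dims[name]; ok {
			dims[name] = 1
		}
	}
	return dims
}

func main() {
	for _, state := range []string{"Master, Running", "Slave, Running"} {
		dims := parseState(state)
		fmt.Printf("%q: master=%d slave=%d running=%d\n",
			state, dims["master"], dims["slave"], dims["running"])
	}
}
```

With parsing along these lines, "Master, Running" would set both master and running, which is the result I expected to see in the charts.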

If it helps, I’m happy to provide the full /v1/servers JSON and additional debug logs, but the minimal reproduction is:

Configure MaxScale with a MariaDB monitor so that one server is Master, Running and others are Slave, Running.

Configure the Netdata maxscale go.d job to use the REST API at 127.0.0.1:8989.

Confirm /v1/servers returns "Master, Running" / "Slave, Running" in .data[].attributes.state.

Observe that in Netdata’s maxscale.server_state charts, only running is set to 1; master and slave remain 0 for all servers.
