WaterSmart Utility Portal -> Home Assistant via MQTT (Austin Water + others)

If your water utility uses the WaterSmart portal (common across the US), you can pull hourly water usage data directly into Home Assistant using a Python script and MQTT — no hardware required.

I built this for Austin Water (austintx.watersmart.com) but the approach should work for any WaterSmart-powered utility with minor adjustments.


What you get

Three sensors auto-discovered in Home Assistant under a single “Austin Water” device:

  • Austin Water Current Hour — gallons used in the most recent hourly reading
  • Austin Water Today Total — running total for today (add this to your Energy dashboard)
  • Austin Water Leak Gallons — leak gallons flagged by the meter

How it works

WaterSmart’s usage graphs are powered by an internal JSON API at:

GET /index.php/rest/v1/Chart/RealTimeChart

This returns years of hourly readings. The script polls it every hour, parses the latest values, and publishes them to MQTT. Home Assistant picks them up automatically via MQTT discovery.

I found this endpoint by opening DevTools (F12) → Network tab → filtering by Fetch/XHR while the usage page loaded. If you’re on a different WaterSmart utility, your endpoint path may differ slightly — the script’s comments explain how to find it.
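As a rough illustration of the parsing step, the sketch below picks the newest reading out of a response shaped the way the script further down expects it (`data.series`, with `read_datetime` and `gallons` on each entry). The sample payload is made up for illustration, and `latest_reading` is a hypothetical helper rather than a function from the script.

```python
import json

# Made-up sample in the shape the script expects from RealTimeChart.
SAMPLE = json.loads("""
{"data": {"series": [
    {"read_datetime": 1719802800, "gallons": 12.5},
    {"read_datetime": 1719806400, "gallons": null},
    {"read_datetime": 1719810000, "gallons": 3.2}
]}}
""")


def latest_reading(raw):
    """Return (timestamp, gallons) for the newest entry, or None if empty."""
    series = raw.get("data", {}).get("series") or []
    if not series:
        return None
    newest = max(series, key=lambda r: r["read_datetime"])
    # Treat a missing or null gallons value as zero, as the script does.
    return newest["read_datetime"], newest.get("gallons") or 0


print(latest_reading(SAMPLE))
```

If your utility returns different field names, this is the part that would need adjusting.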


Authentication

WaterSmart uses two cookies:

  • auth_session — a long-lived “trusted device” token set after you complete email 2FA. You paste this in once and it lasts months.
  • PHPSESSID — a short-lived session token. The script handles this automatically by logging in with your credentials on startup and re-logging in whenever the session expires.

The key insight is that sending auth_session with the login POST tells the site it’s a known trusted device, so it never triggers a 2FA email challenge even on new sessions.
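The "re-login when the session expires" behaviour boils down to a retry-once pattern. Here is a minimal sketch of it with the network calls replaced by stand-in functions, so it runs without a portal account. The names `fetch_with_reauth`, `fake_fetch`, and `fake_login` are hypothetical; the real script does the same thing inline in its main loop.

```python
def fetch_with_reauth(fetch, login):
    """
    Call fetch(); if it reports an expired session, log in once and retry.

    fetch() returns (data, error) where error is None, "reauth", or "error".
    login() returns True on success.
    """
    data, error = fetch()
    if error == "reauth":
        if not login():
            return None, "reauth"
        data, error = fetch()
    return data, error


# Stand-ins for the real network calls, for demonstration only.
calls = {"fetch": 0, "login": 0}


def fake_fetch():
    calls["fetch"] += 1
    # The first call pretends the PHPSESSID has expired.
    return (None, "reauth") if calls["fetch"] == 1 else ({"ok": True}, None)


def fake_login():
    calls["login"] += 1
    return True


print(fetch_with_reauth(fake_fetch, fake_login))
```

Retrying only once per poll keeps a bad password or an expired auth_session from turning into a login loop against the portal.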


Setup

1. Get your auth_session cookie

Log into your WaterSmart portal in your browser and complete any 2FA email verification. Then:

  • Open DevTools (F12)
  • Go to Application tab → Cookies → your portal domain
  • Copy the value of the auth_session cookie
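The copied value is only sent along if it is attached to the portal's hostname. The script further down derives that hostname by stripping the `https://` prefix from WATERSMART_BASE. A slightly more forgiving variant, assuming you might paste a URL with a path on the end, could use the standard library instead. `cookie_domain` is a hypothetical helper name.

```python
from urllib.parse import urlparse


def cookie_domain(base_url):
    """Return the bare hostname of a portal URL, for use as a cookie domain."""
    host = urlparse(base_url).hostname
    if not host:
        raise ValueError(f"Not a full URL: {base_url!r}")
    return host


print(cookie_domain("https://austintx.watersmart.com"))
print(cookie_domain("https://austintx.watersmart.com/index.php/trackUsage"))
```

Both calls yield the same hostname, which could then be passed as the `domain` argument where the script sets the auth_session cookie.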

2. Configure the script

Edit these values at the top of austin_water.py:

WATERSMART_BASE     = "https://austintx.watersmart.com"  # Your portal URL
WATERSMART_EMAIL    = "YOUR_EMAIL_HERE"
WATERSMART_PASSWORD = "yourpassword"
AUTH_SESSION        = "paste-your-auth-session-cookie-here"
MQTT_HOST           = "localhost"  # or "mosquitto" if using docker-compose
MQTT_PORT           = 1883
MQTT_USER           = ""           # if your broker requires auth
MQTT_PASSWORD       = ""

3. Deploy

Option A — Run directly:

pip install requests "paho-mqtt>=2.0"   # the script uses the paho-mqtt 2.x callback API
python austin_water.py

Option B — Docker (add to existing docker-compose.yml):

  austin_water:
    image: python:3.12-slim
    container_name: austin_water_bridge
    restart: unless-stopped
    depends_on:
      - mosquitto
    volumes:
      - /path/to/austin_water.py:/app/austin_water.py:ro
      - /path/to/data:/data   # keeps STATE_FILE across container restarts
    working_dir: /app
    command: >
      sh -c "pip install requests 'paho-mqtt>=2.0' --quiet &&
             python austin_water.py"
    environment:
      - TZ=America/Chicago

4. Add to Home Assistant Energy dashboard

Go to Settings → Energy → Water → Add Water Source and select Austin Water Today Total.


Maintenance

The only ongoing maintenance is refreshing the AUTH_SESSION cookie value in the script if you start receiving 2FA verification emails again. This should happen infrequently — months apart at minimum. The script handles everything else automatically.


Troubleshooting

Symptom | Fix
“Login failed — no PHPSESSID” | Check that your email and password are correct
Repeated “Session expired” in logs | AUTH_SESSION cookie has expired; grab a fresh one from DevTools
No sensors in HA | Confirm the MQTT integration is enabled with discovery on
Getting 2FA emails | AUTH_SESSION has expired; refresh it

Notes & limitations

  • Data is hourly, not real-time. WaterSmart publishes readings on its own schedule, so the “hourly” data might only update every 12 hours.
  • This uses an undocumented internal API that could break if WaterSmart updates their site.
  • Tested on Austin Water only. Other WaterSmart utilities will need the portal URL changed and possibly the API endpoint path; the script comments explain how to find it with DevTools.
  • Your credentials are stored in plaintext in the script, so keep the file somewhere with restrictive permissions.
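If plaintext credentials in the file bother you, one alternative (not what the script below does) is to read them from environment variables. This is a minimal sketch under that assumption. The variable names simply mirror the script's constants, and `load_credentials` is a hypothetical helper.

```python
import os


def load_credentials(env=os.environ):
    """Read portal credentials from the environment, failing loudly if absent."""
    names = ("WATERSMART_EMAIL", "WATERSMART_PASSWORD", "AUTH_SESSION")
    missing = [n for n in names if not env.get(n)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {n: env[n] for n in names}


# Demonstration with a plain dict standing in for os.environ.
demo_env = {
    "WATERSMART_EMAIL": "YOUR_EMAIL_HERE",
    "WATERSMART_PASSWORD": "YOUR_PASSWORD_HERE",
    "AUTH_SESSION": "YOUR_AUTH_SESSION_COOKIE_HERE",
}
print(sorted(load_credentials(demo_env)))
```

With Docker, the values would go under the same `environment:` key that the compose snippet already uses for TZ.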

The script

#!/usr/bin/env python3
"""
WaterSmart -> Home Assistant MQTT Bridge
========================================
Fetches hourly water usage from a WaterSmart utility portal and publishes
it to MQTT for Home Assistant auto-discovery.

Tested with: Austin Water (austintx.watersmart.com)
May work with other WaterSmart-powered utilities — see WATERSMART_BASE below.

HOW IT WORKS
------------
On each poll the script:
  1. Fetches all available hourly readings from the WaterSmart API
  2. Finds readings newer than the last one it processed (tracked in STATE_FILE)
  3. Adds new gallons to a running cumulative total (also stored in STATE_FILE)
  4. Publishes the cumulative total to MQTT with state_class: total_increasing

Home Assistant derives accurate hourly/daily/monthly breakdowns from the
ever-increasing cumulative value. Even though WaterSmart delivers data hours
late, HA will correctly place each reading in the right hour on the Energy
dashboard because the cumulative value steps up at the right pace.

SETUP INSTRUCTIONS
------------------
1. Log into your WaterSmart portal in Chrome/Brave/Edge
2. Complete any 2FA email verification if prompted
3. Open DevTools (F12) -> Application tab -> Cookies -> your portal domain
4. Copy the value of the "auth_session" cookie and paste it below as AUTH_SESSION
5. Fill in your email, password, and MQTT broker details below
6. Run the script or deploy via Docker (see docker-compose.yml)

AUTHENTICATION
--------------
  - auth_session: Long-lived "trusted device" token set after 2FA completion.
                  Paste this in manually once. Refresh it if 2FA emails return.
  - PHPSESSID:    Short-lived session token. Obtained automatically by the script.

STATE FILE
----------
The script stores two values in STATE_FILE (a small JSON file):
  - last_timestamp: Unix timestamp of the newest reading already processed.
                    Prevents double-counting readings across polls.
  - cumulative_gal: Running total gallons since the script first ran.
                    This is what gets published to MQTT and HA.

If you delete STATE_FILE the script starts fresh from zero — HA will see the
cumulative total reset, which may confuse the Energy dashboard. Don't delete it
unless you intend to reset everything.

SENSORS CREATED IN HOME ASSISTANT
-----------------------------------
  - Austin Water Cumulative (gal, state_class: total_increasing) <- Energy dashboard
  - Austin Water Current Hour (gal, state_class: measurement)    <- optional display

DISCLAIMER
----------
This script uses an undocumented internal API. It may break if WaterSmart
updates their website. Use at your own risk.
"""

import json
import logging
import os
import time
from datetime import datetime
import requests
import paho.mqtt.client as mqtt

# ─── CONFIGURATION ────────────────────────────────────────────────────────────

# Your WaterSmart portal base URL
WATERSMART_BASE = "https://austintx.watersmart.com"

# API endpoint for hourly usage data (found via DevTools Network tab)
WATERSMART_URL  = f"{WATERSMART_BASE}/index.php/rest/v1/Chart/RealTimeChart"
LOGIN_URL       = f"{WATERSMART_BASE}/index.php/logout/login?forceEmail=1"

# Your WaterSmart login credentials
WATERSMART_EMAIL    = ""
WATERSMART_PASSWORD = ""

# Long-lived trusted device cookie — obtained from DevTools after completing 2FA.
# To find it: DevTools (F12) -> Application -> Cookies -> your portal domain -> auth_session
# Refresh this value if you start receiving 2FA verification emails again.
AUTH_SESSION = "YOUR_AUTH_SESSION_COOKIE_HERE"

# MQTT broker settings
MQTT_HOST     = "mosquitto"   # Use "mosquitto" if in docker-compose, otherwise IP/hostname
MQTT_PORT     = 1883
MQTT_USER     = ""            # Leave empty if broker requires no authentication
MQTT_PASSWORD = ""            # Leave empty if broker requires no authentication

# Path to state file — stores cumulative total and last processed timestamp.
# Must be in a volume-mounted directory so it persists across container restarts.
STATE_FILE = "/data/austin_water_state.json"

# How often to poll in seconds. WaterSmart data is typically 4-20 hours delayed
# so polling more frequently than every 4 hours yields no benefit.
POLL_INTERVAL = 14400  # 4 hours

# ─── MQTT TOPICS ──────────────────────────────────────────────────────────────

TOPIC_PREFIX          = "homeassistant/sensor/austin_water"
TOPIC_CUMULATIVE_CFG  = f"{TOPIC_PREFIX}_cumulative/config"
TOPIC_CUMULATIVE_STATE= f"{TOPIC_PREFIX}_cumulative/state"
TOPIC_CURRENT_CFG     = f"{TOPIC_PREFIX}_current/config"
TOPIC_CURRENT_STATE   = f"{TOPIC_PREFIX}_current/state"

# ─── LOGGING ──────────────────────────────────────────────────────────────────

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s"
)
log = logging.getLogger(__name__)

# ─── HOME ASSISTANT DISCOVERY PAYLOADS ────────────────────────────────────────

DISCOVERY_PAYLOADS = [
    (
        TOPIC_CUMULATIVE_CFG,
        {
            "name":                "Austin Water Cumulative",
            "unique_id":           "austin_water_cumulative",
            "state_topic":         TOPIC_CUMULATIVE_STATE,
            "unit_of_measurement": "gal",
            "device_class":        "water",
            # total_increasing tells HA this is an odometer-style value.
            # HA uses changes in this value to calculate hourly/daily usage.
            "state_class":         "total_increasing",
            "icon":                "mdi:water-pump",
            "device": {
                "identifiers":  ["austin_water"],
                "name":         "Austin Water",
                "manufacturer": "Austin Water / WaterSmart",
            },
        }
    ),
    (
        TOPIC_CURRENT_CFG,
        {
            "name":                "Austin Water Current Hour",
            "unique_id":           "austin_water_current_hour",
            "state_topic":         TOPIC_CURRENT_STATE,
            "unit_of_measurement": "gal",
            "device_class":        "water",
            "state_class":         "measurement",
            "icon":                "mdi:water",
            "device": {
                "identifiers":  ["austin_water"],
                "name":         "Austin Water",
                "manufacturer": "Austin Water / WaterSmart",
            },
        }
    ),
]

# ─── STATE MANAGEMENT ─────────────────────────────────────────────────────────

def load_state():
    """
    Load persisted state from STATE_FILE.
    Returns dict with keys: last_timestamp (int), cumulative_gal (float).
    Returns defaults if file doesn't exist (first run).
    """
    if os.path.exists(STATE_FILE):
        try:
            with open(STATE_FILE, "r") as f:
                state = json.load(f)
                log.info(
                    f"Loaded state: cumulative={state['cumulative_gal']:.1f} gal, "
                    f"last_timestamp={datetime.fromtimestamp(state['last_timestamp'])}"
                )
                return state
        except (json.JSONDecodeError, KeyError) as e:
            log.warning(f"Could not read state file, starting fresh: {e}")

    log.info("No state file found — starting fresh from zero")
    return {"last_timestamp": 0, "cumulative_gal": 0.0}


def save_state(state):
    """Persist state to STATE_FILE."""
    os.makedirs(os.path.dirname(STATE_FILE), exist_ok=True)
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)


# ─── SESSION MANAGEMENT ───────────────────────────────────────────────────────

def create_session():
    """
    Create a requests session with browser-like headers and the trusted device
    cookie (auth_session) pre-loaded so the site never triggers 2FA challenges.
    """
    session = requests.Session()
    session.headers.update({
        "user-agent":       "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36",
        "accept":           "*/*",
        "accept-encoding":  "gzip, deflate, br",
        "accept-language":  "en-US,en;q=0.6",
        "x-requested-with": "XMLHttpRequest",
        "referer":          f"{WATERSMART_BASE}/index.php/trackUsage",
    })
    session.cookies.set("auth_session", AUTH_SESSION, domain=WATERSMART_BASE.replace("https://", ""))
    return session


def login(session):
    """
    Log in to WaterSmart and return True if successful.
    The auth_session cookie is already present so the site skips 2FA.
    """
    log.info("Attempting login to WaterSmart...")

    try:
        session.get(f"{WATERSMART_BASE}/index.php/logout/login", timeout=30)
    except requests.RequestException as e:
        log.warning(f"Could not pre-fetch login page (continuing anyway): {e}")

    payload = {
        "token":    "",
        "email":    WATERSMART_EMAIL,
        "password": WATERSMART_PASSWORD,
    }

    try:
        resp = session.post(LOGIN_URL, data=payload, timeout=30, allow_redirects=True)
        resp.raise_for_status()
    except requests.RequestException as e:
        log.error(f"Login request failed: {e}")
        return False

    if "PHPSESSID" in session.cookies:
        log.info(f"Login successful. PHPSESSID: {session.cookies['PHPSESSID'][:8]}...")
        return True
    else:
        log.error("Login failed — no PHPSESSID in response. Check credentials or refresh AUTH_SESSION.")
        return False


# ─── DATA FETCHING ────────────────────────────────────────────────────────────

def fetch_data(session):
    """
    Fetch hourly usage data from the WaterSmart API.
    Returns (data, error_type) where error_type is None on success,
    'reauth' if the session expired, or 'error' on other failures.
    """
    try:
        resp = session.get(WATERSMART_URL, timeout=30)

        if "login" in resp.url or resp.status_code in (401, 403):
            log.warning("Session expired, will re-authenticate...")
            return None, "reauth"

        resp.raise_for_status()
        return resp.json(), None

    except requests.RequestException as e:
        log.error(f"Failed to fetch WaterSmart data: {e}")
        return None, "error"


def process_new_readings(raw, state):
    """
    Find readings newer than state['last_timestamp'], add their gallons to
    the cumulative total, and update state. Returns updated state and the
    most recent reading's gallons (for the current hour sensor).
    """
    try:
        series = raw["data"]["series"]
    except (KeyError, TypeError) as e:
        log.error(f"Unexpected data structure: {e}")
        return state, None

    if not series:
        log.warning("Series data is empty")
        return state, None

    # Filter to only readings newer than last processed timestamp
    new_readings = [
        r for r in series
        if r["read_datetime"] > state["last_timestamp"]
        and (r.get("gallons") or 0) > 0
    ]

    if not new_readings:
        log.info("No new readings since last poll")
        # Still return current hour value for that sensor
        latest = max(series, key=lambda x: x["read_datetime"])
        return state, latest.get("gallons", 0) or 0

    # Sort by timestamp
    new_readings.sort(key=lambda x: x["read_datetime"])

    # Sum new gallons and update cumulative total
    new_gallons = sum(r.get("gallons", 0) or 0 for r in new_readings)
    state["cumulative_gal"] = round(state["cumulative_gal"] + new_gallons, 1)
    state["last_timestamp"] = new_readings[-1]["read_datetime"]

    log.info(
        f"Processed {len(new_readings)} new readings | "
        f"Added: {new_gallons:.1f} gal | "
        f"Cumulative: {state['cumulative_gal']:.1f} gal | "
        f"Latest timestamp: {datetime.fromtimestamp(state['last_timestamp'])}"
    )

    current_gal = new_readings[-1].get("gallons", 0) or 0
    return state, current_gal


# ─── MQTT ─────────────────────────────────────────────────────────────────────

def connect_mqtt():
    """Create and connect an MQTT client to the broker."""
    client = mqtt.Client(
        client_id="austin_water_bridge",
        callback_api_version=mqtt.CallbackAPIVersion.VERSION2
    )
    if MQTT_USER:
        client.username_pw_set(MQTT_USER, MQTT_PASSWORD)
    client.connect(MQTT_HOST, MQTT_PORT, keepalive=60)
    client.loop_start()
    return client


def publish_discovery(client):
    """Publish MQTT auto-discovery config so HA creates entities automatically."""
    for topic, payload in DISCOVERY_PAYLOADS:
        client.publish(topic, json.dumps(payload), retain=True)
        log.info(f"Published discovery: {topic}")


def publish_state(client, state, current_gal):
    """Publish current sensor values to their MQTT state topics."""
    client.publish(TOPIC_CUMULATIVE_STATE, str(state["cumulative_gal"]), retain=True)
    client.publish(TOPIC_CURRENT_STATE,    str(round(current_gal, 1)),   retain=True)
    log.info(f"Published: cumulative={state['cumulative_gal']} gal, current={current_gal} gal")


# ─── MAIN LOOP ────────────────────────────────────────────────────────────────

def main():
    log.info("Austin Water MQTT bridge starting...")

    mqtt_client = connect_mqtt()
    publish_discovery(mqtt_client)

    # Load persisted state (cumulative total + last processed timestamp)
    state = load_state()

    session = create_session()
    if not login(session):
        log.error("Initial login failed. Check credentials and AUTH_SESSION cookie. Exiting.")
        return

    while True:
        raw, error = fetch_data(session)

        if error == "reauth":
            session = create_session()
            if login(session):
                raw, error = fetch_data(session)
            else:
                log.error("Re-authentication failed. Will retry next cycle.")

        if raw:
            state, current_gal = process_new_readings(raw, state)
            if current_gal is not None:
                publish_state(mqtt_client, state, current_gal)
                save_state(state)
        elif error == "error":
            log.warning("Fetch error — will retry next cycle")

        log.info(f"Sleeping {POLL_INTERVAL}s until next poll...")
        time.sleep(POLL_INTERVAL)


if __name__ == "__main__":
    main()

If you get it working on a different WaterSmart utility, please share your portal URL and any endpoint differences so others can benefit.

If you’re interested & have the know-how, please consider contributing to wbyoung/watersmart (WaterSmart for Home Assistant) on GitHub, which doesn’t yet have 2FA support. I’m sure others would really appreciate it.

Oh wow, I guess I should have done a simple search to see if this was already built! This is great! I’m not really a developer, so I wouldn’t know how to implement 2FA properly. The way I did it requires that you retrieve your auth_session from dev tools in a separate browser – not exactly something you would expect from a proper integration.

Bummer. There’s an open PR to try to get it done, but I’m not sure it’ll get completed.

I’ve linked back to here from all open issues discussing 2FA. Your thorough description of your work will hopefully help someone down the line.

Thanks!

I just did a quick search for WaterSmart implementations in the US. It might be helpful to add this list to GitHub so users can more easily find your implementation. I was aware of Austin’s smart meters but not that they use “WaterSmart” under the hood.

Texas

California

  • East Bay Municipal Utility District (EBMUD): Offers a portal for viewing daily consumption data.
  • Santa Barbara: Features the WaterSmart Online Tool for 24/7 access to hourly data and leak alerts.
  • Cotati: An early partner using the VertexOne Customer Portal for interval data.
  • Buena Park: Provides a free web portal to monitor daily water flow.
  • Oceanside: Uses WaterSmart for residential data visibility.
  • Santa Rosa & Santa Cruz: Both utilize the analytics portal for customer leak detection.

Florida

Colorado & Mountain West

  • Fort Collins: Currently modernizing with VertexOne software to provide a new customer portal.
  • Thornton: Provides the WaterSmart Portal for tracking daily and hourly usage.
  • Brighton: Partners with WaterSmart for tailored home water reports and an engagement portal.
  • Tempe, AZ: Offers the WaterSmart Customer Portal for hourly usage monitoring.
  • Bend, OR: Uses WaterSmart for daily monitoring and household comparisons.

Other Notable Portals

  • Broken Arrow, OK: Recently launched the WaterSmart portal for near real-time hourly consumption.
  • Billings, MT: Provides a Customer Self-Service Portal for analyzing historical bills and consumption.
  • Crystal Lake, IL: Uses a WaterSmart portal for real-time leak detection and usage.
  • Winston-Salem, NC: Deployed the WaterSmart analytics program for county-wide usage insights.
  • EPCOR USA: Maintains customer portals across various service areas (including AZ, NM, and TX).

Primary Portal Brands to Look For

If a utility uses any of these names, it is a VertexOne-based system:

  1. WaterSmart (Often cityname.watersmart.com)
  2. MyMeter (Commonly used by co-ops and electric utilities)
  3. VXengage (The modernized version of the WaterSmart portal)

Thanks. I made an issue so I don’t lose track of it & will get to it eventually.

Update: Significant improvements to the WaterSmart bridge

After running the original script for a while I discovered several issues and ended up rewriting it significantly. Here's what changed and why.


What was wrong with the original

The MQTT cumulative approach didn't give hourly data. Home Assistant records the cumulative value at the time it changes (when the script polls), not at the time the water was actually used. So instead of seeing hourly bars spread through the day, you'd see one big bar every 4 hours when the script polled — not useful for the Energy dashboard.

WaterSmart timestamps are stored in local time, not UTC. The API returns Unix timestamps that encode the local wall-clock time as if it were UTC. Without correcting for this, readings showed up in HA 5-6 hours earlier than they actually occurred.


What the new approach does

Instead of publishing a cumulative MQTT value, the script now injects each hourly reading directly into HA's long-term statistics database at the correct timestamp using recorder.import_statistics (provided by the Spook custom integration). This means:

  • Hourly bars in the Energy dashboard appear at the correct local time
  • Late-arriving data (WaterSmart is typically 4-20 hours delayed) gets placed in the right hour when it arrives
  • No more giant bars appearing at poll time
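To make the idea concrete, here is a rough sketch of turning hourly readings into statistics rows with a running sum, then splitting them into batches. The row keys `start` and `sum` are my assumption about what import_statistics wants; the script's own code is the reference. `build_stats` and `batches` are hypothetical names.

```python
from datetime import datetime, timezone


def build_stats(readings, starting_total):
    """
    Turn (utc_datetime, gallons) pairs into statistics rows with a running sum.

    Returns (rows, new_total). Each row's "start" is the top of the hour in
    ISO 8601, and "sum" is the cumulative total after that hour.
    """
    rows, total = [], starting_total
    for when, gallons in sorted(readings):
        total = round(total + gallons, 1)
        hour = when.replace(minute=0, second=0, microsecond=0)
        rows.append({"start": hour.isoformat(), "sum": total})
    return rows, total


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for i in range(0, len(rows), size):
        yield rows[i:i + size]


readings = [
    (datetime(2024, 7, 1, 9, tzinfo=timezone.utc), 3.2),
    (datetime(2024, 7, 1, 8, tzinfo=timezone.utc), 12.5),
]
rows, total = build_stats(readings, starting_total=100.0)
for batch in batches(rows, size=50):
    print(batch)
```

Because each row carries its own start time, it does not matter when the script happens to poll; the sum lands in the hour the water was used.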

New prerequisites

  • Spook custom integration via HACS — provides recorder.import_statistics
  • A Home Assistant long-lived access token — Profile → Long-Lived Access Tokens → Create Token
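The long-lived token is sent as a bearer token on each call to HA's REST API. This sketch builds such a request (without sending it) for the service path the script uses. It uses the standard library's urllib so it runs without dependencies, whereas the script uses requests. `build_import_request` is a hypothetical name, and the URL and token are the placeholder values from the config.

```python
import json
import urllib.request


def build_import_request(ha_url, token, payload):
    """Build (but do not send) a POST to the import_statistics service."""
    return urllib.request.Request(
        url=f"{ha_url.rstrip('/')}/api/services/recorder/import_statistics",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_import_request(
    "http://your-ha-ip:8123", "your-long-lived-token-here", {"stats": []}
)
print(req.get_method(), req.full_url)
```

If HA answers with a 401, the token is the first thing to check.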

New configuration options

Three new config values at the top of the script:

# Your utility's local timezone — corrects WaterSmart's timestamp quirk
# WaterSmart stores timestamps as if local time = UTC, so we need to know
# the local timezone to convert correctly. Handles DST automatically.
# Examples: "America/Chicago", "America/New_York", "America/Los_Angeles"
PORTAL_TIMEZONE = "America/Chicago"

# Home Assistant connection
HA_URL   = "http://your-ha-ip:8123"
HA_TOKEN = "your-long-lived-token-here"

The PORTAL_TIMEZONE setting is important — without it readings will appear 5-6 hours off in the Energy dashboard. Set it to your utility's local timezone and DST is handled automatically.
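Here is a small worked example of the correction, assuming the portal behaves as described (local wall-clock time stored as if it were UTC). It mirrors what the script's conversion function does; `portal_ts_to_utc` is just a name for this sketch.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

LOCAL_TZ = ZoneInfo("America/Chicago")


def portal_ts_to_utc(raw_ts):
    """Reinterpret a 'local time stored as UTC' timestamp as true UTC."""
    wall_clock = datetime.fromtimestamp(raw_ts, tz=timezone.utc)
    return wall_clock.replace(tzinfo=LOCAL_TZ).astimezone(timezone.utc)


# A reading taken at 3am local time, stored by the portal as if 3am were UTC.
summer = int(datetime(2024, 7, 1, 3, tzinfo=timezone.utc).timestamp())
winter = int(datetime(2024, 1, 15, 3, tzinfo=timezone.utc).timestamp())

print(portal_ts_to_utc(summer))  # 3am CDT is 08:00 UTC
print(portal_ts_to_utc(winter))  # 3am CST is 09:00 UTC
```

The one-hour difference between the two results is the DST handling: the offset comes from the timezone database for each date, not from a fixed number.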


What was removed

  • Today Total sensor — it was unreliable because WaterSmart delivers data hours late, so the "today" total was always incomplete
  • Leak Gallons sensor — I found it wasn't reliable enough to be useful
  • Hourly polling — changed to every 4 hours since WaterSmart data is 4-20 hours delayed anyway

Energy dashboard setup

After deploying the new script, go to Settings → Energy → Water → Add Water Source and select Austin Water Cumulative. The hourly breakdown will populate as WaterSmart delivers data.


Updated docker-compose

The data directory mount is now required to persist the state file:

  austin_water:
    image: python:3.12-slim
    container_name: austin_water
    restart: unless-stopped
    depends_on:
      - mosquitto
    volumes:
      - /path/to/austin_water.py:/app/austin_water.py:ro
      - /path/to/data:/data
    working_dir: /app
    command: >
      sh -c "pip install requests 'paho-mqtt>=2.0' --quiet &&
             python austin_water.py"
    environment:
      - TZ=America/Chicago
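Since the running total lives entirely in the state file under that /data mount, a crash halfway through a write could leave a truncated file, and the script would then start again from zero. One possible hardening, which is not what the script does, is to write to a temporary file and rename it into place. `save_state_atomic` is a hypothetical helper.

```python
import json
import os
import tempfile


def save_state_atomic(state, path):
    """Write state to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
        os.replace(tmp_path, path)  # atomic when both paths share a filesystem
    except BaseException:
        os.unlink(tmp_path)
        raise


with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "austin_water_state.json")
    save_state_atomic({"last_timestamp": 0, "cumulative_gal": 0.0}, target)
    with open(target) as f:
        print(json.load(f))
```

The temp file is created in the same directory on purpose, so the rename never has to cross a filesystem boundary.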

Updated script

#!/usr/bin/env python3
"""
WaterSmart -> Home Assistant Statistics Bridge
===============================================
Fetches hourly water usage from a WaterSmart utility portal and injects
each reading directly into Home Assistant's long-term statistics database
at the correct timestamp using the import_statistics API via Spook.

Readings are sent in small batches to avoid HA's 32KB event data limit.

Tested with: Austin Water (austintx.watersmart.com)
May work with other WaterSmart-powered utilities with minor URL changes.

TIMESTAMP NOTE
--------------
WaterSmart stores timestamps as if the local time value is UTC — i.e. a
reading at 3am local time is stored with a Unix timestamp that equals 3am
UTC, not 3am local. To correct this, the script takes the UTC datetime
representation of each timestamp and reinterprets those numbers as local
time, then converts to true UTC for Home Assistant. Set PORTAL_TIMEZONE
to your utility's local timezone to get correct display times in HA.

SETUP INSTRUCTIONS
------------------
1. Log into your WaterSmart portal in Chrome/Brave/Edge
2. Complete any 2FA email verification if prompted
3. Open DevTools (F12) -> Application tab -> Cookies -> your portal domain
4. Copy the value of the "auth_session" cookie — paste it below as AUTH_SESSION
5. Create a HA long-lived access token:
   Profile (bottom left) -> Long-Lived Access Tokens -> Create Token
6. Install Spook via HACS and restart HA
7. Fill in all config values below and deploy

ENERGY DASHBOARD
----------------
Add "Austin Water Cumulative" as your water source in Settings -> Energy -> Water.

STATE FILE
----------
Stores last_timestamp and cumulative_gal. Do not delete this file.

DISCLAIMER
----------
Uses an undocumented WaterSmart API. May break if WaterSmart updates their site.
"""

import json
import logging
import os
import time
from datetime import datetime, timezone
from zoneinfo import ZoneInfo
import requests
import paho.mqtt.client as mqtt

# ─── CONFIGURATION ────────────────────────────────────────────────────────────

WATERSMART_BASE = "https://austintx.watersmart.com"
WATERSMART_URL  = f"{WATERSMART_BASE}/index.php/rest/v1/Chart/RealTimeChart"
LOGIN_URL       = f"{WATERSMART_BASE}/index.php/logout/login?forceEmail=1"

WATERSMART_EMAIL    = "YOUR_EMAIL_HERE"
WATERSMART_PASSWORD = "YOUR_PASSWORD_HERE"

# Long-lived trusted device cookie (from DevTools after completing 2FA)
AUTH_SESSION = "YOUR_AUTH_SESSION_COOKIE_HERE"

# Timezone of the WaterSmart portal — used to correctly interpret timestamps.
# WaterSmart stores timestamps as if local time = UTC, so we need to know the
# local timezone to convert correctly. Change this for other utilities.
# Examples: "America/Chicago", "America/New_York", "America/Los_Angeles"
PORTAL_TIMEZONE = "America/Chicago"

# Home Assistant settings
HA_URL   = "http://192.168.1.100:8123"
HA_TOKEN = ""

HA_STATISTIC_ID   = "sensor.austin_water_austin_water_cumulative"
HA_STATISTIC_NAME = "Austin Water Cumulative"

MQTT_HOST     = "mosquitto"
MQTT_PORT     = 1883
MQTT_USER     = ""
MQTT_PASSWORD = ""

STATE_FILE    = "/data/austin_water_state.json"
POLL_INTERVAL = 14400  # 4 hours
BATCH_SIZE    = 50     # readings per import_statistics call

# ─── TIMEZONE SETUP ───────────────────────────────────────────────────────────

LOCAL_TZ = ZoneInfo(PORTAL_TIMEZONE)

def watersmart_ts_to_utc(raw_ts):
    """
    Convert a WaterSmart raw Unix timestamp to a correct UTC datetime.

    WaterSmart stores timestamps as if the local time value equals UTC.
    For example, a reading at 3am local time has a Unix timestamp that
    represents 3am UTC rather than 3am local. To correct this:
      1. Convert the raw timestamp to a naive UTC datetime (gets the local
         time numbers without timezone info)
      2. Attach the local timezone to those numbers (reinterprets them as
         local time)
      3. Convert to true UTC

    This correctly handles DST transitions automatically.
    """
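    # utcfromtimestamp() is deprecated since Python 3.12; build an aware UTC
    # datetime and swap its tzinfo so the wall-clock numbers become local time.
    wall_clock = datetime.fromtimestamp(raw_ts, tz=timezone.utc)
    local_dt   = wall_clock.replace(tzinfo=LOCAL_TZ)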
    return local_dt.astimezone(timezone.utc)

# ─── MQTT TOPICS ──────────────────────────────────────────────────────────────

TOPIC_PREFIX        = "homeassistant/sensor/austin_water"
TOPIC_CURRENT_CFG   = f"{TOPIC_PREFIX}_current/config"
TOPIC_CURRENT_STATE = f"{TOPIC_PREFIX}_current/state"

# ─── LOGGING ──────────────────────────────────────────────────────────────────

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s"
)
log = logging.getLogger(__name__)

# ─── HOME ASSISTANT DISCOVERY ─────────────────────────────────────────────────

DISCOVERY_PAYLOADS = [
    (
        TOPIC_CURRENT_CFG,
        {
            "name":                "Austin Water Current Hour",
            "unique_id":           "austin_water_current_hour",
            "state_topic":         TOPIC_CURRENT_STATE,
            "unit_of_measurement": "gal",
            "device_class":        "water",
            "state_class":         "measurement",
            "icon":                "mdi:water",
            "device": {
                "identifiers":  ["austin_water"],
                "name":         "Austin Water",
                "manufacturer": "Austin Water / WaterSmart",
            },
        }
    ),
]

# ─── STATE MANAGEMENT ─────────────────────────────────────────────────────────

def load_state():
    if os.path.exists(STATE_FILE):
        try:
            with open(STATE_FILE, "r") as f:
                state = json.load(f)
                log.info(
                    f"Loaded state: cumulative={state['cumulative_gal']:.1f} gal, "
                    f"last_timestamp={watersmart_ts_to_utc(state['last_timestamp'])} UTC"
                )
                return state
        except (json.JSONDecodeError, KeyError) as e:
            log.warning(f"Could not read state file, starting fresh: {e}")

    log.info("No state file found — starting fresh from zero")
    return {"last_timestamp": 0, "cumulative_gal": 0.0}


def save_state(state):
    os.makedirs(os.path.dirname(STATE_FILE), exist_ok=True)
    with open(STATE_FILE, "w") as f:
        json.dump(state, f, indent=2)


# ─── SESSION MANAGEMENT ───────────────────────────────────────────────────────

def create_session():
    session = requests.Session()
    session.headers.update({
        "user-agent":       "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/146.0.0.0 Safari/537.36",
        "accept":           "*/*",
        "accept-encoding":  "gzip, deflate, br",
        "accept-language":  "en-US,en;q=0.6",
        "x-requested-with": "XMLHttpRequest",
        "referer":          f"{WATERSMART_BASE}/index.php/trackUsage",
    })
    session.cookies.set("auth_session", AUTH_SESSION, domain=WATERSMART_BASE.replace("https://", ""))
    return session


def login(session):
    log.info("Attempting login to WaterSmart...")
    try:
        session.get(f"{WATERSMART_BASE}/index.php/logout/login", timeout=30)
    except requests.RequestException as e:
        log.warning(f"Could not pre-fetch login page (continuing anyway): {e}")

    payload = {"token": "", "email": WATERSMART_EMAIL, "password": WATERSMART_PASSWORD}
    try:
        resp = session.post(LOGIN_URL, data=payload, timeout=30, allow_redirects=True)
        resp.raise_for_status()
    except requests.RequestException as e:
        log.error(f"Login request failed: {e}")
        return False

    if "PHPSESSID" in session.cookies:
        log.info(f"Login successful. PHPSESSID: {session.cookies['PHPSESSID'][:8]}...")
        return True
    else:
        log.error("Login failed — no PHPSESSID in response. Check credentials or refresh AUTH_SESSION.")
        return False


# ─── DATA FETCHING ────────────────────────────────────────────────────────────

def fetch_data(session):
    try:
        resp = session.get(WATERSMART_URL, timeout=30)
        if "login" in resp.url or resp.status_code in (401, 403):
            log.warning("Session expired, will re-authenticate...")
            return None, "reauth"
        resp.raise_for_status()
        return resp.json(), None
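    except requests.RequestException as e:
        log.error(f"Failed to fetch WaterSmart data: {e}")
        return None, "error"
    except ValueError as e:
        # Older requests versions raise a plain ValueError for a non-JSON body
        log.error(f"WaterSmart returned a non-JSON response: {e}")
        return None, "error"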


# ─── HOME ASSISTANT STATISTICS ────────────────────────────────────────────────

def import_batch(stats):
    """Send a single batch of statistics entries to HA."""
    payload = {
        "has_mean":            False,
        "has_sum":             True,
        "name":                HA_STATISTIC_NAME,
        "source":              "recorder",
        "statistic_id":        HA_STATISTIC_ID,
        "unit_of_measurement": "gal",
        "stats":               stats,
    }
    try:
        resp = requests.post(
            f"{HA_URL}/api/services/recorder/import_statistics",
            headers={
                "Authorization": f"Bearer {HA_TOKEN}",
                "Content-Type":  "application/json",
            },
            json=payload,
            timeout=30,
        )
        if not resp.ok:
            log.error(f"HA API error {resp.status_code}: {resp.text}")
            return False
        return True
    except requests.RequestException as e:
        log.error(f"Failed to import batch to HA: {e}")
        return False


def import_statistics(new_readings, starting_cumulative):
    """
    Inject readings into HA's long-term statistics database in small batches.
    Timestamps are corrected using watersmart_ts_to_utc() before submission.
    """
    if not new_readings:
        return True

    stats = []
    cumulative = starting_cumulative
    for reading in new_readings:
        gallons = reading.get("gallons", 0) or 0
        cumulative = round(cumulative + gallons, 1)
        ts_utc = watersmart_ts_to_utc(reading["read_datetime"])
        stats.append({
            "start": ts_utc.isoformat(),
            "sum":   cumulative,
            "state": round(gallons, 1),
        })

    total_batches = (len(stats) + BATCH_SIZE - 1) // BATCH_SIZE
    for i in range(0, len(stats), BATCH_SIZE):
        batch     = stats[i:i + BATCH_SIZE]
        batch_num = (i // BATCH_SIZE) + 1
        if not import_batch(batch):
            log.error(f"Batch {batch_num}/{total_batches} failed")
            return False
        log.info(f"Imported batch {batch_num}/{total_batches} ({len(batch)} entries)")
        time.sleep(0.5)

    log.info(f"Successfully imported all {len(stats)} statistics entries to HA")
    return True


# ─── MQTT ─────────────────────────────────────────────────────────────────────

def connect_mqtt():
    client = mqtt.Client(
        client_id="austin_water_bridge",
        callback_api_version=mqtt.CallbackAPIVersion.VERSION2
    )
    if MQTT_USER:
        client.username_pw_set(MQTT_USER, MQTT_PASSWORD)
    client.connect(MQTT_HOST, MQTT_PORT, keepalive=60)
    client.loop_start()
    return client


def publish_discovery(client):
    for topic, payload in DISCOVERY_PAYLOADS:
        client.publish(topic, json.dumps(payload), retain=True)
        log.info(f"Published discovery: {topic}")


def publish_current(client, current_gal):
    value = round(current_gal, 1)
    client.publish(TOPIC_CURRENT_STATE, str(value), retain=True)
    log.info(f"Published MQTT current hour: {value} gal")


# ─── MAIN LOOP ────────────────────────────────────────────────────────────────

def main():
    log.info(f"Austin Water statistics bridge starting (timezone: {PORTAL_TIMEZONE})...")

    mqtt_client = connect_mqtt()
    publish_discovery(mqtt_client)

    state = load_state()

    session = create_session()
    if not login(session):
        log.error("Initial login failed. Check credentials and AUTH_SESSION cookie. Exiting.")
        return

    while True:
        raw, error = fetch_data(session)

        if error == "reauth":
            session = create_session()
            if login(session):
                raw, error = fetch_data(session)
            else:
                log.error("Re-authentication failed. Will retry next cycle.")

        if raw:
            try:
                series = raw["data"]["series"]
            except (KeyError, TypeError) as e:
                log.error(f"Unexpected data structure: {e}")
                series = []

            new_readings = [
                r for r in series
                if r["read_datetime"] > state["last_timestamp"]
                and (r.get("gallons") or 0) > 0
            ]
            new_readings.sort(key=lambda x: x["read_datetime"])

            if new_readings:
                success = import_statistics(new_readings, state["cumulative_gal"])

                if success:
                    for r in new_readings:
                        state["cumulative_gal"] = round(
                            state["cumulative_gal"] + (r.get("gallons") or 0), 1
                        )
                    state["last_timestamp"] = new_readings[-1]["read_datetime"]

                    latest_utc = watersmart_ts_to_utc(state["last_timestamp"])
                    log.info(
                        f"Processed {len(new_readings)} new readings | "
                        f"Cumulative: {state['cumulative_gal']:.1f} gal | "
                        f"Latest: {latest_utc} (local: {latest_utc.astimezone(LOCAL_TZ)})"
                    )

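                    # Publish the newest reading in the series (zero-usage hours included),
                    # the same value the no-new-readings branch below publishes
                    latest = max(series, key=lambda x: x["read_datetime"])
                    publish_current(mqtt_client, latest.get("gallons", 0) or 0)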
                    save_state(state)
            else:
                log.info("No new readings since last poll")
                if series:
                    latest = max(series, key=lambda x: x["read_datetime"])
                    publish_current(mqtt_client, latest.get("gallons", 0) or 0)

        elif error == "error":
            log.warning("Fetch error — will retry next cycle")

        log.info(f"Sleeping {POLL_INTERVAL}s until next poll...")
        time.sleep(POLL_INTERVAL)


if __name__ == "__main__":
    main()
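
If you want to sanity-check the running-sum logic before letting the script write to Home Assistant, here is a minimal offline sketch of the rows that import_statistics builds. It is not part of the script: build_stats and to_utc are made-up names, and to_utc is only a stand-in for watersmart_ts_to_utc that assumes read_datetime is plain Unix seconds and skips the portal timezone correction.

```python
from datetime import datetime, timezone


def to_utc(ts):
    # Hypothetical stand-in for watersmart_ts_to_utc(): treats ts as plain
    # Unix seconds and does NOT apply the portal timezone correction.
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def build_stats(readings, starting_cumulative):
    """Turn hourly readings into HA-style statistics rows with a running sum."""
    stats = []
    cumulative = starting_cumulative
    for reading in sorted(readings, key=lambda r: r["read_datetime"]):
        gallons = reading.get("gallons") or 0  # None and missing both count as 0
        cumulative = round(cumulative + gallons, 1)
        stats.append({
            "start": to_utc(reading["read_datetime"]).isoformat(),
            "sum":   cumulative,
            "state": round(gallons, 1),
        })
    return stats


# Made-up sample data, deliberately out of order and with one empty reading
sample = [
    {"read_datetime": 1704070800, "gallons": 7.5},
    {"read_datetime": 1704067200, "gallons": 12.3},
    {"read_datetime": 1704074400, "gallons": None},
]
rows = build_stats(sample, starting_cumulative=100.0)
for row in rows:
    print(row)
```

Feeding it a saved copy of your portal's series is a quick way to confirm the sum column never decreases and the start times land on hour boundaries before any real import.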

Happy to answer questions. If you're on a different WaterSmart utility, let me know your timezone and portal URL and I can help you get it configured.