I wonder how big a hosts file could get before it degraded performance?
You can still do DNS over VPN. That is, I could set up an unbound cache on a VPS and secure my traffic to that server using wireguard or openvpn. This just pushes the eyeballs out of my home and into someone else's datacenter, though.
I'm guessing that the time required to open the hosts file for reading would eclipse the amount of time required to parse it and do a linear search on the entries. I would think that the file would have to be measured in the hundreds of megabytes for performance to be a significant consideration, but I haven't benchmarked it.
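For anyone who wants to put a rough number on this, here is a minimal benchmark sketch. It assumes a synthetic file of made-up `blocked-N.example` entries and a plain Python line scan, which is not how a real libc resolver reads /etc/hosts, so treat the timings as an order-of-magnitude hint only. The function names are mine, and the OS page cache will make repeated runs look faster than a cold read.

```python
import os
import tempfile
import time


def write_synthetic_hosts(path, n_entries):
    """Write n_entries made-up lines in hosts-file format: '<ip> <hostname>'."""
    with open(path, "w") as f:
        for i in range(n_entries):
            f.write(f"0.0.0.0 blocked-{i}.example\n")


def lookup(path, hostname):
    """Linear scan: return the address mapped to hostname, or None if absent."""
    with open(path) as f:
        for line in f:
            # Drop trailing comments, then split into address + names.
            fields = line.split("#", 1)[0].split()
            if len(fields) >= 2 and hostname in fields[1:]:
                return fields[0]
    return None


def time_worst_case(n_entries):
    """Time a lookup that misses, which forces a scan of the whole file."""
    fd, path = tempfile.mkstemp(suffix=".hosts")
    os.close(fd)
    try:
        write_synthetic_hosts(path, n_entries)
        size_bytes = os.path.getsize(path)
        start = time.perf_counter()
        result = lookup(path, "not-present.example")
        elapsed = time.perf_counter() - start
        return size_bytes, elapsed, result
    finally:
        os.remove(path)
```

Calling `time_worst_case(n)` for a few values of `n` (say 1_000, 100_000, 1_000_000) and comparing file size against elapsed time should show roughly where the scan starts to matter on a given machine.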
This is true, but it requires deep packet inspection, and that's something which usually isn't on by default. They might enable it for specific clients under some circumstances, but I haven't heard of ISPs logging that level of detail permanently. I suppose they could run a service that just inspects DNS packets, pulls out the domains, and correlates them with each client, but I haven't heard of that being deployed in the wild (although perhaps that's changed).
Their DNS servers, however, are definitely doing this, and they sell that data to 3rd parties.
TLS SNI also leaks the domain name, but again that would require deep packet inspection to extract and correlate with the clients. Definitely possible, but probably not deployed.
DNS blocking is an affordable technology deployed in many countries. For example in the UK if you use the sort of large ISP advertised on TV it has DNS blocking.
With DNS blocking if you try to look up a "forbidden" FQDN you get back either a bogus NXDOMAIN or A records chosen by the blocker.
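To make that concrete, here is a rough sketch of how you might spot the bogus-NXDOMAIN case: send the same query to the ISP resolver and to a resolver you trust, then compare the response headers. The code only builds the query and reads the 12-byte DNS header (RCODE and answer count); actually sending it over UDP port 53 is left out, and the function names and the heuristic are my own assumptions. It doesn't catch substituted A records, since differing addresses can also come from ordinary CDN load balancing.

```python
import struct

NXDOMAIN = 3  # RCODE value for "no such name"


def build_query(name, qid=0x1234):
    """Build a wire-format DNS query for the A record of `name`."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0.
    header = struct.pack("!HHHHHH", qid, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def summarize_response(msg):
    """Pull the RCODE and answer count out of a DNS response header."""
    qid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", msg[:12])
    return {"id": qid, "rcode": flags & 0x000F, "answers": ancount}


def looks_blocked(isp_response, reference_response):
    """Heuristic: ISP resolver says NXDOMAIN while the reference has answers."""
    isp = summarize_response(isp_response)
    ref = summarize_response(reference_response)
    return isp["rcode"] == NXDOMAIN and ref["rcode"] == 0 and ref["answers"] > 0
```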
DoH bypasses DNS blocking pretty cheaply. DNSSEC would detect the tampering and fail the lookup, but it doesn't bypass the block. Tor bypasses it but at considerable cost.
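For a sense of why DoH is so cheap: per RFC 8484, a GET request just carries the ordinary wire-format DNS query, base64url-encoded without padding, in a `dns` query parameter over HTTPS. A minimal sketch of building such a URL follows. The endpoint in the usage note is a placeholder, not a real service, the function names are mine, and the HTTPS request itself (with `Accept: application/dns-message`) is left out.

```python
import base64
import struct


def build_query(name):
    """Wire-format DNS query for the A record of `name`, with ID 0.

    RFC 8484 suggests an ID of 0 so identical queries cache well over HTTP.
    """
    header = struct.pack("!HHHHHH", 0, 0x0100, 1, 0, 0, 0)  # RD set, 1 question
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def doh_get_url(endpoint, name):
    """Return the DoH GET URL: base64url of the query, padding stripped."""
    encoded = base64.urlsafe_b64encode(build_query(name)).rstrip(b"=")
    return f"{endpoint}?dns={encoded.decode('ascii')}"
```

For example, `doh_get_url("https://doh.example/dns-query", "www.example.com")` produces a URL whose `dns` parameter is the encoded query. To anyone watching the wire, fetching it looks like any other HTTPS request to that host.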
The (eventually indefinitely delayed) UK government plans to institute mandatory censorship of the Internet relied on DNS blocking as their backstop. The idea was if anybody anywhere in the world didn't voluntarily agree to obey censorship rules, they'd be blocked in the UK. The government would just accept that some proportion of users would install Tor to bypass that restriction. DoH means "some proportion of users" potentially becomes "everybody with a modern browser" and that was not palatable.
DPI is available on inexpensive routers now and has been an option on Cisco/Fortigate/Palo Alto/Ubiquiti gear for ages. I have no doubt that it is heavily used at most ISPs.
There's a big difference between the data being available vs. being put to use. ISPs have no incentive to put a bunch of effort into collecting data from the 1% or less of their customers who don't use their DNS servers. Thus, it's extremely unlikely they run packet inspection on every packet just so that they can collect browsing history from people who are privacy conscious.