chaps's comments

This is terrible advice, friend.

> This is terrible advice, friend

Why? I say this as someone with personal counsel on retainer, and who has been pulled over but not gotten a ticket in a decade. I’m legally conservative but also practical.

Most people don’t have the time to be arraigned every time they might have gone five over. “Never talk to the police” means every random stop turns into interrogation. That simply isn’t the baseline risk for most of us.


And I'm saying that as an award-winning investigative journalist who focuses on transparency and police misconduct. This is the first time in years where I only have one lawsuit against a police agency. I also have counsel on retainer :)

I believe your advice is bad because it's being given to a wide audience that very likely doesn't know better.


It's not terrible advice, it's the same advice you will hear from many lawyers. If a cop pulls you over because you were blatantly speeding - saying nothing, or admitting to it and apologising, are both reasonable things to do. Never talking to police is a safer blanket decision, but you can have some grey area.

Worked for a company whose automated options trading system was an.... excel spreadsheet. Every now and then I would need to RDP to a machine to RDP into another machine to restart it. Pain. So much pain.

Cheeky response..

It would take a 1 year project and 5 devs to replace what one finance person created in a day.

Then when it didn't work, complain the SME got the requirements wrong.

I'm being silly, but as a dev I hate the hate for Excel, though I know from years of experience that it's also a nightmare.


I think it's important to think of Excel as a tool for modelling reality and not a tool for changing it. IMO Excel should not be producing data feeds that other tools expect real time access to, nor should it make API calls that mutate state on other platforms.

At that point why not code something using the Interop APIs to programmatically interface with the spreadsheet? It's a PITA to code but it works.

I'm sure they were glad they had you to fix their problems!

What year?

Erm, "Assigned" in this context is not new: https://law.justia.com/cases/federal/appellate-courts/ca5/17...

"More simply, a hash value is a string of characters obtained by processing the contents of a given computer file and assigning a sequence of numbers and letters that correspond to the file’s contents."

From 2018 in United States v. Reddick.
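
For anyone following along, here is a minimal sketch of what that sentence describes. SHA-256 is my assumption purely for illustration (the quote above doesn't name a function), and `file_hash` is a hypothetical helper. The point is only that the value is computed from the file's bytes, so identical contents always produce the identical string.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string.

    SHA-256 is an assumption for illustration, not the function
    discussed in the case.
    """
    return hashlib.sha256(data).hexdigest()


# Two copies of the same contents hash to the same value;
# changing a single byte gives a different value.
original = b"contents of some file"
copy = b"contents of some file"
altered = b"contents of some file!"

same = file_hash(original) == file_hash(copy)
different = file_hash(original) != file_hash(altered)
```

Whether you call that step "calculating" or "assigning" is the wording question being argued here; the code itself only shows the computation.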


The calculation is what assigns the value.


No. The calculation is what determines what the assignation should be. It does not actually assign anything.

This FOIA litigation by ACLU v ICE goes into this topic quite a lot: https://caselaw.findlaw.com/court/us-2nd-circuit/2185910.htm...


Yes, Google's calculation.


Did Google invent this hash?


Why is that relevant? Google used a hashing function to persist a new record within a database. They created a record for this.

Like I said in a sib. comment, this FOIA lawsuit goes into questions of hashing pretty well: https://caselaw.findlaw.com/court/us-2nd-circuit/2185910.htm...


I've taken this trip a couple times! It's a genuinely wonderful experience. Every Amtrak trip I take has some memorable experience I hope to never forget.

Tip: get the sleeping car and eat every meal they offer, especially if you travel by yourself. You'll be sat with random folk who almost always have interesting stories to share.


I've done that a couple of times and completely agree. One time I sat in the dining car right next to a retired realtor who told me a whole lot about real estate in Los Angeles. (Unfortunately I forgot most of it because at that time I was a renter and hadn't bought any properties yet.) Another time I was sitting next to a frail-looking grandma assisted by her granddaughter. The grandma was very talkative and told me a lot of memories from the AIDS epidemic in the 1980s.


Agreed, and the food is surprisingly good in my experience. It's not like the stuff you get from the cafe car on a normal Amtrak ride.


City of Chicago did some strange geofenced text messages to folk in westside Chicago to get people to go inside during the early COVID days.

From leaked emails:

    Hey folks, We have a situation on Westside neighborhoods (specifically CPD 11th District) where folks between the ages of 16-25 are congregating outside in groups and not heeding the shelter in place message. Mayor would like to know if we can do a targeted texting in that geography to spread the following messages:
    [...]
    3. CPD will do a verbal warning but if you repeatedly disregard the warning, CPD will issue citations and/or arrest.
    4. By not following these directives, you are putting yourself at risk but also your family members, particularly those who live with you who are elderly or sick. Not sure who is in charge but I think I have included all relevant people here. If not, please add. Can you tell me if such a geo-coded texting is possible and when we might be able to put it out? We probably need to do it on a regular basis for the message to sink in. Let everyone on this chain know.
==========

    I'm sorry, but WEA is not intended for that type of usage. It is supposed to be used in dire emergencies only. People have the ability to opt out of messages at any time. If we inundate them with messages they do not find useful, they will opt out and won't be alerted the next time we have an Active Shooter Incident, Tornado Warning, Ordered Evacuation, Amber Alert, or some other extreme situation.
==========

    Anna and I spoke. CPD believes Saturday at 5 pm would be a good time to send out the next one. Perhaps once a week but we will monitor the dispersal orders to see if this is a continued need. Thank you for flexibility.

https://www.documentcloud.org/documents/20652293-re_-geocodi...


Oh my gosh these government workers. What were they thinking? "Hey guys! If minorities don't fear COVID then we'll abuse the emergency alert system to threaten them with something they do fear: cops. Just be extra sure it's geofenced to the economically disadvantaged sectors."


Many years ago I worked at a company that had to print out every transaction with a dot-matrix printer every evening. It was my group's job to do very minimal maintenance on it like adding toner/paper and such. When filling the toner though, you had to screw the toner container onto a special.... gyrating.... setup just to make sure all toner came out. A thing that struck me about it was how uncannily, uh, sexual, it was in its gyrations.

Does anybody have any idea what model of printer this might have been? I'd love to see a video of it again.


Can you expand on what you mean, exactly?


They thought they had a better advertising chance for `uv` than they actually did. As far as I can tell, `uv` is a replacement for `pip`. But the project linked by OP doesn't actually replace `pip`, but rather a small subset of functionality in `pip` - `pip freeze`. Unless `uv` has some sort of import scanning functionality, the suggestion to use `uv` instead doesn't really make any sense.
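
To make "import scanning" concrete, here is a rough sketch using only the standard library's `ast` module. This is not how the project linked by OP or `uv` works internally (I haven't checked either); `top_level_imports` is a hypothetical helper that just lists the top-level package names a source file imports.

```python
import ast


def top_level_imports(source: str) -> set[str]:
    """Collect top-level module names imported by a piece of Python source."""
    tree = ast.parse(source)
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os.path" -> "os"
            for alias in node.names:
                names.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # Skip relative imports ("from . import x"); they are local code.
            if node.level == 0 and node.module:
                names.add(node.module.split(".")[0])
    return names


example = "import os.path\nfrom requests.adapters import HTTPAdapter\nfrom . import sibling\n"
found = top_level_imports(example)
```

A real tool would still need to map import names to distribution names (they often differ) and filter out the standard library, which this sketch does not attempt.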


On the contrary, read this about `uv` by the author of `rye`:

https://lucumr.pocoo.org/2024/8/21/harvest-season/

> Domination is a goal because it means that most investment will go into one stack. I can only re-iterate my wish and desire that Rye (and with it a lot of other tools in the space) should cease to exist once the dominating tool has been established. For me uv is poised to be that tool. It's not quite there today yet for all cases, but it will be in no time, and now is the moment to step up as a community and start to start to rally around it.

Yes, I linked to an obscure feature uv supports. It already does rather a lot more.


Again, you seem to be arguing against points no one is making. uv is rad, you don't have to convince me. But people are still going to use pip for a while and if they do, the repo in OP is helpful. You need to apply your zealotry to relevant situations for it to be effective.

You are talking past people. It comes across like you're not reading comments with sincerity but rather as an empty vessel to attach your own personal opinion as a rebuttal.

For instance, the rye author's views (no one mentioned rye btw) have little bearing on how uv helps in this particular instance.


I do work with "open data" on a near-obsessive basis and -- friends, please do not trust "open data" portals to reflect reality accurately. The datasets are often curated, categories changed during the ETL processes, rows missing, and things like that. For example, Chicago's "crimes" dataset intentionally doesn't include all homicides. Can't remember the exact dataset, but I once had a conversation with Chicago's head of open data who told me that they intentionally removed many rows because they were concerned that the public was going to misinterpret the results... but didn't make it clear that rows were missing. So I guess everybody gets the opportunity to misinterpret the results!

FOIA is the better alternative because it gives you the original, pre-cleaned data. Open data is a lie.


This is super true. For my city’s portal as well. I’ve found one way around this by versioning the dataset - that is, committing the diffs in git. Credit to Simon Willison’s git-scraping technique.

I do this with my power company’s outage map: https://github.com/patricktrainer/entergy-outages

67k commits!

https://simonwillison.net/2020/Oct/9/git-scraping/
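
The core idea can be sketched without git at all: keep each snapshot of the dataset and compare them, so silently removed or recategorized rows show up. This is only an illustration, not the git-scraping setup itself (that commits each fetched copy to a repository and lets git record the diffs). The `diff_snapshots` helper, the `id` key column, and the sample rows are all made up for the example.

```python
import csv
import io


def load_rows(csv_text: str, key: str) -> dict[str, dict[str, str]]:
    """Index the rows of a CSV snapshot by the value in the key column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key]: row for row in reader}


def diff_snapshots(old_csv: str, new_csv: str, key: str = "id") -> dict[str, list[str]]:
    """Report which row keys were removed, added, or changed between snapshots."""
    old = load_rows(old_csv, key)
    new = load_rows(new_csv, key)
    return {
        "removed": sorted(old.keys() - new.keys()),
        "added": sorted(new.keys() - old.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Invented sample data: row 3 disappears and row 2 is recategorized.
yesterday = "id,category\n1,HOMICIDE\n2,THEFT\n3,ASSAULT\n"
today = "id,category\n1,HOMICIDE\n2,BURGLARY\n4,THEFT\n"
report = diff_snapshots(yesterday, today)
```

With git the same information comes for free from `git log -p` on the committed file, which is why committing every fetch is the nicer long-term approach.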


That's a really freaking neat trick. Thanks!


Hah that's classic politics "Hello John Q. Public, here's all our data! It speaks for itself" John Q. Public: "Wow, you really improved last few years homicide-wise" "And so you see, a third party unrelated to us has just confirmed what a great job we're doing with simple empirical, evidence-based governance!"

So that means what you want to do is specialize in identifying bias in these datasets and finding the smoking gun. Such a task can be an ugly business but necessary for the public good, pushing data sharers to either share good data, or not share, but not share tricksy data in this unethical way.


I worked in open data for quite a few years. This is a very weird take.

Open data portals generally have data in a useful form. FOI probably gives you PDFs.


"FOI probably gives you PDFs."

Having submitted thousands of FOIA requests, I get the impression that you haven't, actually, submitted many FOIA requests. I've received many, many, many, many non-PDF FOIA responses.

Share some of the open data you've worked with and I'd love to poke at it and tell you where it's wrong and where assumptions about its data are wrong.


Thousands?! Do you have a public list on everything?

Have you had to fight a lot of malicious compliance which balloons up your request count? Or do they typically require an incredibly narrow request, so that you have to submit N entries per topic?


What an unappealing offer. No thanks.


Not any different from being red-team'd, but you do you. But thanks for your input -- your apparent reluctance to be challenged makes it clearer why you think my take is odd.

Even still, I challenge you to challenge yourself to understand where your blind spots are. I've done it many times and have found significant problems with the open datasets I've worked with. If you think my take is weird, it's only because you're not looking or the data you're looking at is inconsequential.

To me, this stuff is literal life and death. If we make mistakes in our analysis because of misinformation from the source, then the lives and deaths of people we're trying to understand become tarnished. We can treat our neighbors better than that.


>your apparent reluctance to be challenged

There are lots of reasons someone doesn't want to be "challenged" by some blowhard on the internet. One of them, true in my case, is I don't even work in this area anymore, as I said in my original post.

I really hope you are nicer in person.


Fair.

Can I ask why exactly you think my take is "very weird"?

Your original post was exceptionally dismissive, without explanation, and your comment on FOI was so confidently probabilistic that it struck me that you misunderstood what I was suggesting. Pardon my aggressive response. I get a lot of similar dismissiveness whenever I interact with government agencies, often where I'm told that something doesn't exist, or "Just look at the data portal", while the data portal is intentionally missing the information I'm looking for. I don't expect you to answer my question, but I hope you can try to understand where I'm coming from in my thoughts and opinions on open data. My intent was only to get you to share your thoughts further.


Look, for starters, the stuff we're talking about covers a pretty broad spectrum. Your framing of the question about "intentionally missing" stuff suggests that you're interested in transparency-style data: data that gives you insight into the operations of the government body. And yes, if you are looking for data that might reflect poorly on the organisation, an open data portal is generally not the place to go for it.

This HN item for instance, is not about that kind of data. The datasets in question tell you about the transport network, the services, the patronage, the history, all kinds of interesting stuff.

So I find it "weird" that you would respond to a good-faith effort of sharing tons of information about a public transport network with this hostile approach of disparaging open data portals, and advocating instead an approach which is extremely resource-intensive for government bodies, when it's completely uncalled for.

Yeah, if you want to investigate a government cover-up, or shine light on some terrible mismanagement of resources, go for your life and submit FOI requests. Your mention of having filed thousands of FOI requests suggests you have consumed many tens of thousands of hours of public servants' time, and I really hope the results justify it.


Lemme tell you a story.

Years ago during the early days of the pandemic, a Harvard epidemiology student asked me to proofread his paper that argued that COVID-19 killed more white people than any other race. The dataset he used was the Cook County Medical Examiner dataset. There was a column in there for the race information. If you're curious how it's populated, I can share with you the information.

Previously, I'd FOIA'd the data and received many more columns of information including the names of the individuals who'd died which showed a very clear pattern that the race information on the open data portal was not always accurate for Hispanic-origin names. The details are complicated, and I'm happy to explain my fact checking methods, but the Harvard student's analysis was just flat wrong because it made assumptions that the race data was correct. It was not.

Their response was initially along the lines of, "even if it's 50% it's still going to be true". It ended up being more like 80%, showing that people with Hispanic-origin names were significantly more likely to die of COVID-19.

If you think your audience isn't academics at mega institutions who believe that open data is 100% accurate data, then you've made many incorrect assumptions and I encourage you to reconsider.


>your audience

"my audience"?

What makes you think I have an audience?


I hope you're a nicer person in-person, too.


Where I grew up the data for murders is curated in such a way that anybody that dies 24 after being attacked is not considered a ‘murder’. They do this to reduce the statistical murder rate.


Can you say more about this?


Well now we know why crime is down


I can only imagine. Many ETLs are already messy in companies with better tooling and processes.

Would love to read more about your experience with Open Data. Any place where I can reach out?


Here's something about shotspotter data in Chicago: https://x.com/foiachap/status/1775296597850480663

And this one makes some rounds: https://mchap.io/that-time-the-city-of-seattle-accidentally-...

Feel free to reach out!


Although pre-cleaned data is often not reflective of reality and requires careful work to use, often requiring a lot more knowledge of the field.


But even if a dataset is incomplete or not accurate, do you think we could at least get directionally right insights from such datasets?


Yes, of course we can. But I cannot ignore the harms in doing so, by misrepresenting the data in a way that prevents others from understanding what is or isn't there -- it happens regularly. These datasets are often used as a political tool and contracted with local universities to show that they're providing data... though not actually providing the accurate data. Simultaneously though, people who don't know data will champion the data as accurate because it comes from a university program.

Sometimes what can happen is that somebody inexperienced will try to make some assessment of the data and come to the exact wrong conclusion because they didn't know what not to trust. But it gets on the news anyway and damage is done.

We can do better than that.


Worked on a team that deployed crowdstrike agents to organize and... Yeah. One of the biggest problems we had was that the daemon would log a massive amount of stuff... But had no config for it to stop or reduce it.


This sort of philosophical question becomes more important when you think of second-order-and-beyond. Think of, in this case, that color is just a manifestation of experience. But that manifestation of experience applies to, for example, the beautiful smells that a rose throws out into the world. To a degree, the redness has an affectation on the world, which is separate to each person.

My theory is that these things are just questions of resolutions of varying latencies.

