Hacker News | tytso's comments

Single malt was the currency of choice when bribing and/or placating SREs and hwops folks. For example, if a SWE botched a rollout that caused multiple SREs to get paged at 3am, a bottle of single malt donated to the SRE bar was considered a way of apologizing.


Single malts, especially Islays, are still very much welcome :)


There was definitely a certain amount of "I told you so" vibes, but I don't blame the author. It appears that he was attacked by a lot of Ello founders and fans for raising some cautionary notes. And as it turns out, he was right and they were wrong.

We would all like to have a model where users don't get charged money, and yet are not the product. But I haven't seen a model that works to date. In some cases, I don't mind my personal data getting sold; in other cases I pay money because the service is valuable. But I certainly make backups, since I don't assume, even when I pay $$$, that the company won't go poof in the night....


I believe GP was referring to the Ello founder Budnitz, who said that line, as the "dick."

I agree. He was responding to perfectly justified -- and accurate -- criticism by saying how sad it is to be a person with such views of the world.


Yes. The author of the article knows how to write enjoyably.

The CEO he was quoting is the subject of my schadenfreude.


>> But I haven't seen a model that works to date

Sure you have. Amazon grew without giving stuff away for free. Customers paid (just below market rate) from day 1. This demonstrated the -convenience- of ecommerce. It had revenues from the first sale. Yes, it spent mountains of VC money on marketing and development, but -not- on just buying stuff for you so you think it'll be free forever.

Uber is the same, although it's less clear that users will pay for what a ride really costs. (And their margin makes it attractive for competition.)

In both cases, though, there is revenue from customers from day 1. You can wind prices up. It's really hard to "go from free to paid".


> I haven't seen a model that works to date

IRC. NNTP. SMTP. XMPP. HTTP.

It's just that nobody wants to work on protocols anymore. Ever since the world's richest was suddenly a computer guy, no one wants to work on anything without a business model that includes taking complete control over what is built. A product, if you will.

In the background, there's always some geeks slaving away with new protocols and federated models. That will not become mainstream, not in our current society. But societies change over time. There is always hope.

Protocols, not products, people!


But one of the four freedoms is being able to modify/tweak things, including the model. If all you have is the model weights, then you can't easily tweak the model. The model weights are hardly the preferred form for making changes to update the model.

The equivalent would be someone who gives you only the binary to LibreOffice. That's perfectly fine for editing documents and spreadsheets, but suppose you want to fix a bug in LibreOffice? Just having the binary is going to make it quite difficult to fix things.

Similarly, suppose you find that the model has a bias in terms of labeling African Americans as criminals; or women as lousy computer programmers. If all you have is the model weights of the trained model, how easily can you fix the model? And how does that compare with running emacs on the LibreOffice binary?


If all you have are the model weights, you can very easily tweak the model. How else are all these "decensored" Llama2 showing up on Hugging Face? There's a lot of value in a trained LLM model itself and it's 100% a type of openness to release these trained models.

What you can't easily do is retrain from scratch using a heavily modified architecture or different training data preconditioning. So yes, it is valuable to have dataset access and compute to do this and this is the primary type of value for LLM providers. It would be great if this were more open — it would also be great if everybody had a million dollars.

I think it's pretty misguided to put down the first type of value and openness when honestly they're pretty independent, and the second type of value and openness is hard for anybody without millions of dollars to access.


Well, by that argument it's trivially easy to run emacs on a binary and change a pathname --- or wrap a program with another program to "fix a bug". Easy, no?

And yet, the people who insist on having source code so they can edit the program and recompile it have said that for programs, having just the binary isn't good enough.
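
For anyone who hasn't done the "edit a pathname in the binary" trick mentioned above, here is a rough Python sketch of what it amounts to. The byte string standing in for the executable and both paths are made up for illustration; this is not a real ELF file. It also shows why the trick is so limited: the replacement can't be longer than the original string, because every later byte has to keep its offset.

```python
def patch_path(blob: bytes, old: bytes, new: bytes) -> bytes:
    """Overwrite a NUL-terminated path inside a binary image without changing its size."""
    if len(new) > len(old):
        raise ValueError("replacement must not be longer than the original string")
    needle = old + b"\x00"
    if blob.count(needle) != 1:
        raise ValueError("expected exactly one occurrence of the original string")
    # Pad with NUL bytes so every byte after the string keeps its offset.
    padded = new + b"\x00" * (len(old) - len(new)) + b"\x00"
    return blob.replace(needle, padded)


# Hypothetical stand-in for an executable: opaque bytes around an embedded path.
image = b"\x7fELF\x02\x01" + b"/usr/share/app/data\x00" + b"\xde\xad\xbe\xef"
patched = patch_path(image, b"/usr/share/app/data", b"/opt/app/data")
```

It works for a string constant. It does nothing for changing logic, which is the point of asking for the source.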


>suppose you find that the model has a bias in terms of labeling African Americans as criminals; or women as lousy computer programmers. If all you have is the model weights of the trained model, how easily can you fix the model?

That's textbook fine-tuning and is basically trivial. Adding another layer and training that is many orders of magnitude more efficient than retraining the whole model and works ~exactly as well.

Models are data, not instructions. Analogies to software are actively harmful. We do not fix bugs in models any more than we fix bugs in a JPEG.
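
To make the "add another layer and train that" idea above concrete, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the `FROZEN_W` matrix stands in for released weights, the four samples are invented, and a real fine-tune would use an ML framework on a far larger model. It only shows the mechanics: the base weights are never touched and gradient steps are applied to the new layer alone.

```python
import math

# Hypothetical stand-in for released weights; never updated below.
FROZEN_W = [[0.5, -1.0], [1.5, 0.25], [-0.75, 1.0]]


def frozen_features(x):
    """Forward pass through the frozen base layer (tanh units)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def predict(head, bias, x):
    """Output of the new linear layer stacked on the frozen features."""
    return sum(h * f for h, f in zip(head, frozen_features(x))) + bias


def mean_squared_error(head, bias, samples):
    return sum((predict(head, bias, x) - t) ** 2 for x, t in samples) / len(samples)


def train_head(samples, epochs=500, lr=0.1):
    """Fit only the new layer with per-sample gradient descent."""
    head = [0.0] * len(FROZEN_W)
    bias = 0.0
    for _ in range(epochs):
        for x, target in samples:
            feats = frozen_features(x)
            err = sum(h * f for h, f in zip(head, feats)) + bias - target
            # Gradient of the squared error with respect to the head only.
            head = [h - lr * err * f for h, f in zip(head, feats)]
            bias -= lr * err
    return head, bias


# Invented (input, desired output) pairs describing the corrected behaviour.
SAMPLES = [([0.0, 1.0], 1.0), ([1.0, 0.0], -1.0), ([1.0, 1.0], 0.5), ([-1.0, 0.5], 0.0)]

loss_before = mean_squared_error([0.0] * len(FROZEN_W), 0.0, SAMPLES)
head, bias = train_head(SAMPLES)
loss_after = mean_squared_error(head, bias, SAMPLES)
print(f"loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

Whether this works "~exactly as well" as retraining on a real model is the claim under debate here; the sketch doesn't settle that.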


Instructions is exactly what weights are. We just have no idea what those instructions are.


You can fine-tune a model; you've got way more power to do so given the trained model than starting from scratch with the raw data.


Next step will be to ask for GPU time. Because even with data, model code, and training framework, you may have no resources to train. "The equivalent would be" someone gives you the code, but no access to the mainframe which is required to compile it. Which would make it not open source(?) There are other variations, like the original compiler was lost and current compilers aren't backward compatible. Does that make old open source code closed now?

In other words, there should be a reasonable line for when a model is called open source. In the extreme view, it's when the model, the training framework, and the data are all available for free. This would mean an open source model can be trained only on public domain data, which makes the class of open source models very, very limited.

More realistic is to make the code and the weights available, so that with some common knowledge a new model can be trained, or the old one fine-tuned, on available data. Important note: weights cannot be reproduced even if the original training data is available. It will always be a new model with (slightly) different responses.


Down voted, hmm... I'll add a bit more then. Sometimes it's even good that a model cannot be easily reproduced. Original developers usually have some skills and responsibility, while 'hackers' don't. It's easy to introduce bias into the data, like removing selected criminal records, and then publish a model with a similar name. That would be confusing; some may mistake the fake one for the real one.

PS: If I ever make my models open I can't open the data anyway. The license on the images directly prohibits publishing them.


They don't have to be low-end. You can buy higher-end Chromebooks, but they cost more money. Do people remember the "netbooks" that were super-cheap Windows laptops with 10 inch screens? Even if you installed Linux on one, with the 512MB or 1GB (or maybe 2GB for the really highly spec'ed out netbooks), there were real limits to what they could do.

If you want something super-cheap, then perhaps it won't be useful 5 or 10 years later. You get what you pay for; this isn't unique to Chromebooks.


Unfortunately, the supply chain often goes 3 and 4 levels deep. And by the time you get to companies that far in the supply chain, (a) no one has ever heard of that company, so trying to threaten them with reputational damage doesn't really work (it will be some random set of Chinese characters for a company in Shenzhen, for example), and (b) it will turn out that the team that wrote the device driver for that particular subcomponent in the SOC was disbanded as soon as the part was released, and 4 years later, half are working for a different company, and half died during the COVID pandemic.

Sure, if you could set the Wayback machine back in time, and require that device driver be upstreamed, with enough programming information so it's possible to maintain the device driver, maybe it would be possible to upgrade to a newer kernel that doesn't have eleven hundred zero-day vulnerabilities. But meanwhile, back in the real world, very often there's not a whole lot you can do. So this is why it's kind of sad when people insist on buying Nvidia video chips that have proprietary blobs because performance, or power consumption, or whatever, instead of the more boring alternative that doesn't have the same eye-bleeding performance, but which has an open source device driver. Our buying choices, and the product reviewers that only consider performance, or battery life, etc., drive the supply chain, and the products that we get. And this is why we can't have nice things.


It might be worth taking a look at Bensonwood (https://bensonwood.com). They do some very impressive, high-end pre-fab homes, and they solve the "must fit in highway lanes" by shipping walls that have windows, electrical wiring, plumbing, etc., all already pre-installed in their factory in New Hampshire. When we investigated using them 3 years ago, they didn't support pre-installed CAT 5 wiring or Optical Fiber, but I wouldn't be surprised if they can do that now. :-)

This is all done using computer-controlled manufacturing equipment, much of which is imported from Europe, where they are much more advanced on this front than in the U.S. One of the advantages of having computer-controlled nail guns and vacuum-operated "wall flippers" is that the construction tolerances are far tighter than if you have humans nailing in the shingles, sometimes while on a ladder 15 feet above the ground.

The downside, of course, is that today they only have their one factory in New Hampshire, and while the walls can be shipped by truck on highways, if you want to build a large, luxury pre-fab home in Arizona, the trucks have to travel a long way, and that adds to the cost. This hasn't stopped some of their customers, though. Take a look at their web site for some example houses that they have built --- it's a far cry from what most people think of when they hear about "pre-manufactured houses". These are not trailer park homes!


Cgroups are a lot more than just "namespaces". It is also the mechanism by which you can constrain how much CPU, Memory, Network Bandwidth, Storage IOPS or Throughput, etc., processes in a particular cgroup or container can use.
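
As a rough illustration of the resource-control side, here is a Python sketch that builds the values you would write into a cgroup v2 directory. The control file names (`cpu.max`, `memory.max`, `io.max`) are the real cgroup v2 interface files; the helper names, the chosen limits, and the `8:0` device number are assumptions made up for this example. `apply_limits` is defined but not called, since writing under `/sys/fs/cgroup` needs privileges and the relevant controllers enabled in the parent group.

```python
from pathlib import Path


def render_limits(cpu_cores=None, memory_bytes=None, io_device=None,
                  read_iops=None, period_us=100_000):
    """Return {control file name: value} for a cgroup v2 directory."""
    files = {}
    if cpu_cores is not None:
        # cpu.max holds "<quota> <period>", both in microseconds.
        files["cpu.max"] = f"{int(cpu_cores * period_us)} {period_us}"
    if memory_bytes is not None:
        # memory.max is a hard limit in bytes.
        files["memory.max"] = str(memory_bytes)
    if io_device is not None and read_iops is not None:
        # io.max is keyed by the block device's "major:minor" number.
        files["io.max"] = f"{io_device} riops={read_iops}"
    return files


def apply_limits(cgroup_dir: Path, files):
    """Write each value into its control file (requires suitable privileges)."""
    for name, value in files.items():
        (cgroup_dir / name).write_text(value)


# Hypothetical limits: 1.5 CPUs, 512 MiB of memory, 200 read IOPS on device 8:0.
limits = render_limits(cpu_cores=1.5, memory_bytes=512 * 1024**2,
                       io_device="8:0", read_iops=200)
```

Namespaces, by contrast, only change what a process can see; nothing in them caps how much it can use.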


That's fair.

But I think the core point here is that, as with much of the Plan 9 design, by including a more elegant and powerful abstraction in the core design, the need for a much more powerful and much more complicated abstraction layer on top was obviated, if not eliminated.


Disclaimer: I work for Google but nothing I say here is Google's opinion or relies on any Google internal information.

I'm not surprised that Workspace accounts weren't included in the initial rollout. Workspace setups have interesting requirements that aren't necessarily there for personal accounts. For example, under some circumstances, if an employee gets hit by a bus, and there is critical business data which is stored in the employee's account, an appropriately authorized Workspace admin is supposed to be able to gain access to the employee's account. But what is the right thing to do for passkey access? Especially if the user uses a passkey to authenticate to some non-Google resource like, say, Slack which has been set up for corporate use? Should the Workspace admin be able to impersonate the corporate employee in order to gain access to non-Google resources via passkey? What about if the employee (accidentally) uses their corporate account to set up a passkey to a personal account, such as for example E*Trade? Maybe the Workspace admin should have a setting where passkey creation is disabled except for an allowlist of domains that are allowed for corporate workflows? It's complicated, and if I were the product manager, I'd want to take my time, understand all of the different customer requirements (where customer === the Workspace administrator who is paying the bills) before rolling out support for Workspace accounts.


> motion sickness is a major barrier

I recall a story from a colleague who knew some folks who had worked on the game System Shock (this was in the early nineties). System Shock was one of the first games that had an engine that implemented real 3D physics; so when you threw a grenade, it would describe a real parabola. And you could lean around a corner and sneak a peek without exposing your entire body to enemy fire, and when you did that, the first-person shooter rendering would realistically reflect that. They had an experimental version of the game that was hooked to a virtual reality headset at the time, and gave up on it because, as one of them joked, it was "virtual reality, real nausea".

This was 30 years ago, and things haven't improved since then.


"Descent" goes back 28 years and I remember getting pretty disoriented and a little nauseous, worse than I ever experienced flying a small airplane.

"Descent" was the game where you blast robots in a very 3-d mine.

One oddity of "VR" is it initially attracts people with excellent visuospatial analysis skills; the problem is the majority of the population is not good at it. It would be like implementing a user interface based on bench pressing 275 pounds of real world weights; it would be an incredibly popular fad among people already qualified to participate, then the general public would LOL and that's it. So that's the problem selling VR to the general public; most folks aren't very good at solving maze puzzles and drawing 3D CAD drawings in their heads so a UI based on that will be a hard sell.


When I started out in web design and development in 1995, a lot of companies were showing early "VR" and 3D interfaces, touting them as the next great thing. Somehow, people got the idea that reaching around in 3D space for everything was better than just picking from a menu, a list, or an index -- like we have done for 1,000 years.

I feel like all the 3D hype is just that. While it could be fun in games in a holodeck-type environment (but probably not outside of that, 'cause physics), I don't think the majority of everyday human interactions with information are better off in 3D. Why would anyone think so? We don't read in 3D. We don't write in 3D. We don't make pictures in 3D. Why would a 3D interface be better?


There is an old AI joke about a robot that, after being told it should go to the Moon, climbs a tree, sees that it has made the first baby steps toward the goal, and then gets stuck.

The way that people are trying to use ChatGPT is certainly an example of what humans _hope_ the future of human/computer interaction should be. Whether or not Large Language Models such as ChatGPT are the path forward is yet to be seen. Personally, I think that model of "ever-increasing neural network sizes" is a dead-end. What is needed is better semantic understanding --- that is, mapping words to abstract concepts, operating on those concepts, and then translating concepts back into words. We don't know how to do this today; all we know how to do is to make the neural networks larger and larger.

What we need is a way to have networks of networks, and to create networks which can handle memory, and time sense, and reasoning, such that the network of networks has pre-defined structures for these various skills, and ways of training these sub-networks. This is all something that organic brains have, but which neural networks today do not.


> What is needed is better semantic understanding --- that is, mapping words to abstract concepts, operating on those concepts, and then translating concepts back into words. We don't know how to do this today; all we know how to do is to make the neural networks larger and larger.

It's pretty clear that these LLMs basically can already do this. I mean, they can solve the exact same tasks in a different language, mapping from the concept space they've been trained on in English into other languages. It seems like you are awaiting a time when we explicitly create a concept space with operations performed on it; this will never happen.


> that is, mapping words to abstract concepts, operating on those concepts, and then translating concepts back into words

I feel like DNNs do this today. At higher levels of the network they create abstractions and then the eventual output maps them to something. What you describe seems evolutionary, rather than revolutionary to me. This feels more like we finally discovered booster rockets, but still aren't able to fully get out of the atmosphere.


They might have their own semantics, but it's not our semantics! The written word already can only approximate our human experience, and now this is an approximation of an approximation. Perhaps if we were born as writing animals instead of talking ones...


This is true, but I think this feels evolutionary. We need to train models using all of the inputs that we have... touch, sight, sound, smell. But I think if we did do that, they'd be eerily close to us.

