phonon · 2024-11-16T01:50:57 1731721857

> The box of Unsalted Butter already says "Milk" in the ingredients list.

No it doesn't, it says "Cream". That's the issue.

phonon · 2024-11-13T01:20:53 1731460853

You make sure to use a Japanese specific font (not just a CJK one) if the language setting is JA. It's not that hard...

lmm · 2024-11-13T01:56:13 1731462973

> You make sure to use a Japanese specific font (not just a CJK one) if the language setting is JA.

What "language setting", and how do you check it? Do your testers know that they have to test this Japan-specific thing, and how to even tell whether it's working or not? And what about users who need to read Japanese, but don't want their UI to be in Japanese - or, worse, users who need to read both Japanese and Chinese?

numpad0 · 2024-11-13T05:25:39 1731475539

You have to render strings in (Chinese|Japanese) font if you believe the string to be meant to be (Chinese|Japanese). Literally that. That's the official Consortium sanctioned way to handle Chinese/Japanese/Korean characters.

There are no particularly good ways or ready-made frameworks for that, as it wasn't a huge issue pre-Internet because most people are monolingual in these languages: you picked a language in the OS (or bought a computer with a ROM) and everything the user would see was in the user's language.

It's a giant pain today - there's no "Arial in Chinese", no easy way to mesh multiple fonts together in a UI, no good way to determine the intended language of a string, and the fallback default is the least common denominator of Simplified Chinese (PRC) for some reason - but not much is being done on those fronts.

numpad0 · 2024-11-14T03:29:17 1731554957

I realized I can add a bit more here in hope it'll be useful to someone at some point, all [citation needed]:

It seems that there is a Chinese regulation(?) that requires Chinese text to _always_ be displayed in an _appropriate_ font, which causes the "wrong font" problem for Japanese users. There are no equivalent requirements elsewhere, so software faces no hard dilemma of following two regulations for the same code point; one side is law and the other is just unanimous customer complaints.

The rationale for that Chinese regulation(?), I think, is that Chinese computer users long had an exaggerated version of the same problem due to how Kanji was adopted into Unicode; the Kanji map was created by first enumerating common-use Japanese Kanji from reasonable existing tables, then merging in additional Chinese common-use characters not found in Japanese texts, of which there are plenty. Duplicates were merged on loose and pragmatic-at-the-time judgements, some by shapes, some by meanings(!), some left as duplicates.

It seems to me that this led to a situation, lasting at least some period of time, in which Chinese UTF-8 strings on a computer were displayed in a mishmash of Japanese and Chinese fonts, with the Chinese font filling the gaps of the Japanese one rather than the other way around. That was frustrating (imagine the top 10 most used letters of the alphabet in Comic Sans and the rest in Arial), and it was solved by that regulation (the regulatory setup is a Chinese national GB or GB/T standard plus enforcement of "applicable industrial standards", I believe).

There have been attempted solutions to this problem of Chinese and Japanese text coexisting on the same system, such as registering all the Japanese Kanji as first choices in the Unicode IVS map alongside legitimate Japanese variants, so that any character followed by its IVS suffix sequence would render in its Japanese form. Obviously there's no font that this is going to work well with, and it's also unnecessary bloat and just a committee backstage influence war.

The moral of the story is, the "it's all Kanji after all" approach never worked, simultaneous Chinese-and-Japanese support in Unicode and Unicode-based apps is a mess, and it has to be fixed higher up at some point in the future.

phonon · 2024-11-16T02:09:43 1731722983

I don't disagree. The only practical strategy now is that if you know the text/user is Japanese, use a specifically Japanese font (Windows and macOS have a few built-in options), not just one that has the CJK codepoints.
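A minimal sketch of that strategy, assuming the app already knows a BCP 47 language tag for the text or the user. The function name `pick_cjk_font` and the choice of Noto families are illustrative assumptions, not something from this thread; any region-specific font would do. The Simplified Chinese default only mirrors the fallback behaviour described above.

```python
# Hypothetical mapping from language/script to a region-specific font family.
# The Noto Sans JP/KR/SC/TC families exist; picking them here is an assumption.
FONT_BY_LANG = {
    "ja": "Noto Sans JP",
    "ko": "Noto Sans KR",
    "zh-hans": "Noto Sans SC",
    "zh-hant": "Noto Sans TC",
}

# Region subtags that imply a Chinese script when no script subtag is given.
ZH_REGION_SCRIPT = {"cn": "hans", "sg": "hans", "tw": "hant", "hk": "hant", "mo": "hant"}


def pick_cjk_font(lang_tag: str, default: str = "Noto Sans SC") -> str:
    """Return a font family for a language tag such as 'ja-JP' or 'zh-Hant-HK'."""
    parts = lang_tag.replace("_", "-").lower().split("-")
    primary = parts[0]
    if primary == "zh":
        for sub in parts[1:]:
            if sub in ("hans", "hant"):
                return FONT_BY_LANG[f"zh-{sub}"]
            if sub in ZH_REGION_SCRIPT:
                return FONT_BY_LANG[f"zh-{ZH_REGION_SCRIPT[sub]}"]
        return FONT_BY_LANG["zh-hans"]
    # Unknown languages fall through to the default font.
    return FONT_BY_LANG.get(primary, default)
```

Calling `pick_cjk_font("ja-JP")` selects the Japanese family, while `pick_cjk_font("zh-TW")` selects the Traditional Chinese one. This does nothing for mixed Japanese and Chinese text in one string, which is the harder case raised upthread.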

rjh29 · 2024-11-13T04:02:53 1731470573

For example on an android phone, set language to Japanese and all kanji is in a Japanese font by default.

For more niche uses you can usually set the font or language on a per app basis.

Every Japanese phone is using Unicode already, as is most modern PC software.

lmm · 2024-11-13T04:16:47 1731471407

> For example on an android phone, set language to Japanese and all kanji is in a Japanese font by default.

If it's an app that uses the native UI toolkit, sure. If it's using one of the package-it-up frameworks, you'd better hope the developer configured it correctly.

> For more niche uses you can usually set the font or language on a per app basis.

You very often can't, or it's impractically difficult for regular users. Try changing your locale but not your language and watch how many programs screw it up.

Tor3 · 2024-11-13T12:15:31 1731500131

> You make sure to use a Japanese specific font (not just a CJK one) if the language setting is JA. It's not that hard...

I need to use Japanese language in a setting outside of the local language setting. Even my wife needs that (she's a native Japanese). Just switching everything to JA is simply not an option, and shouldn't be necessary if just UTF-8 could do the right thing. Granted, it does, to a certain point. But sometimes there are issues which I haven't been able to work around.

phonon · 2024-11-10T20:30:23 1731270623

This overlooks CSS Paged Media based options like paged.js, weasyprint, etc. (You can find the full list here..some open source, some commercial)[0]

[0]https://www.print-css.rocks/tools

xiaohanyu · 2024-11-11T02:43:10 1731292990

Author here.

I mentioned https://polytype.dev/ at the end of the post, which has paged.js included.

It's not that hard to simulate pagination with JavaScript; the deal breaker for me is still line breaking and the nuances of mixed-language typesetting.

phonon · 2024-11-11T05:10:11 1731301811

That's pretty indirect....you might want to look more closely... https://drafts.csswg.org/css-page/ covers quite a lot...

You also didn't mention the new (Chrome only) CSS text-wrap: pretty

https://developer.chrome.com/blog/css-text-wrap-pretty

https://docs.google.com/document/d/1jJFD8nAUuiUX6ArFZQqQo8yT...

https://chromestatus.com/feature/5145771917180928

phonon · 2024-11-10T08:07:08 1731226028

https://news.ycombinator.com/item?id=37559371

airstrike · 2024-11-10T10:49:52 1731235792

Hah! It wasn't open sourced then, but I did buy the book and followed his advice!

nhatcher · 2024-11-10T10:45:38 1731235538

Oh wow! Good find!

phonon · 2024-11-10T07:42:33 1731224553

Have you looked at these other somewhat similar Rust projects?

https://github.com/logisky/LogiSheets

https://github.com/natefduncan/excel-emulator

https://github.com/jiradaherbst/XLFormula-Engine

https://github.com/omid/formula

https://crates.io/crates/df-web/0.1.28

nhatcher · 2024-11-10T11:45:51 1731239151

I _think_ I saw the XLFormula-Engine. At least the name rings a bell.

I guess I have some reading to do!

What is the last one?

phonon · 2024-11-10T19:27:11 1731266831

I'm not exactly sure...it seems to handle Excel files in some fashion. It came up when I looked through this https://crates.io/keywords/excel

phonon · 2024-11-09T22:26:37 1731191197

This looks great! Do you use cached calculation chains for performance optimizations? Do you take volatile functions into account?

https://learn.microsoft.com/en-us/office/vba/excel/concepts/...

nhatcher · 2024-11-09T22:36:34 1731191794

> Do you use cached calculation chains for performance optimizations?

Not yet, there is heavy research in that direction. I will write on this soon-ish.

> Do you take volatile functions into account?

Yes, for instance RANDBETWEEN and NOW are implemented. Things like `IF(RANDBETWEEN(1, 500) > 200, A1, A2)` work fine.

Thanks

8n4vidtmkvmk · 2024-11-09T23:00:20 1731193220

What does that actually mean, "works"? I don't know how that behaves in Google Sheets or Excel. Is it evaluated exactly once the first time the formula is entered? Every time you focus the input? Is the dice rerolled when A1 or A2 is modified? What?

nhatcher · 2024-11-09T23:30:07 1731195007

Hi 8n4vidtmkvmk, the algorithms for evaluating spreadsheets are surprisingly tricky, mainly because of the dependencies. The dependencies are only known at runtime and in Excel are lazily evaluated. So something like `IF(condition, value1, value2)` would first evaluate the condition; if it is true it will evaluate value1 but not value2. So things that in other programming languages are a circular dependency are not so in Excel. The problem of computing the dependencies might be solved by topological sort. The complication of the runtime dependencies is made worse by having dependencies that change every time (or whose outputs do not depend solely on their inputs), like random functions or date functions. An optimization while evaluating a spreadsheet would be to only compute those cells that depend on cells whose value changed. If you do that you might miss those volatile functions.

I realize I am most likely babbling too much.

Yes, volatile functions like RANDBETWEEN get evaluated each time a cell changes. They don't get evaluated when you focus on them.
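A toy sketch of the ideas above, not the engine's actual implementation. The `Sheet` class and its method names are hypothetical. Formulas are Python callables that read other cells through `get`, so dependencies are discovered at runtime and the untaken branch of a conditional is never evaluated. Volatile cells are marked dirty on every change so the "only recompute what changed" optimization does not skip them. There is no cycle detection here.

```python
class Sheet:
    def __init__(self):
        self.formulas = {}     # cell name -> constant or callable(get)
        self.volatile = set()  # cells that must be recomputed on every change
        self.cache = {}        # cell name -> last computed value
        self.deps = {}         # cell name -> cells it read on its last evaluation

    def set_cell(self, name, formula, volatile=False):
        self.formulas[name] = formula
        if volatile:
            self.volatile.add(name)
        else:
            self.volatile.discard(name)
        self.recalculate(name)

    def _dirty_cells(self, changed):
        # Start from the edited cell plus every volatile cell, then keep adding
        # any cell that read a dirty cell last time it was evaluated.
        dirty = {changed} | self.volatile
        grew = True
        while grew:
            grew = False
            for cell, reads in self.deps.items():
                if cell not in dirty and reads & dirty:
                    dirty.add(cell)
                    grew = True
        return dirty

    def recalculate(self, changed):
        for cell in self._dirty_cells(changed):
            self.cache.pop(cell, None)
        for cell in self.formulas:
            self.value(cell)

    def value(self, name):
        if name in self.cache:
            return self.cache[name]
        formula = self.formulas.get(name, 0)  # empty cells read as 0
        reads = set()

        def get(ref):
            reads.add(ref)  # record the dependency as it is actually used
            return self.value(ref)

        result = formula(get) if callable(formula) else formula
        self.deps[name] = reads
        self.cache[name] = result
        return result
```

For example, `sheet.set_cell("B1", lambda get: get("A1") if get("FLAG") else get("B1"))` looks circular, but as long as `FLAG` is truthy the self-reference is never followed, which matches the lazy `IF` behaviour described above.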

8n4vidtmkvmk · 2024-11-10T17:43:21 1731260601

Thanks. I wonder if it would make sense for RANDBETWEEN to take a seed so that it becomes non-volatile unless you use time as the seed.
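A rough sketch of what that could look like, assuming a hypothetical third `seed` argument that neither Excel nor the project is known to support. With a fixed seed the result depends only on the inputs, so the cell would no longer need volatile treatment.

```python
import random


def randbetween(low: int, high: int, seed=None) -> int:
    """Hypothetical seeded variant of RANDBETWEEN (inclusive bounds)."""
    if seed is None:
        # No seed: behaves like a volatile function, new value on every call.
        return random.randint(low, high)
    # Fixed seed: a fresh generator makes the result a pure function of inputs.
    return random.Random(seed).randint(low, high)
```

One catch with this design is that every cell using the same seed and bounds gets the same number, so a real version would probably need to mix something like the cell address into the seed.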

phonon · 2024-11-06T21:16:57 1730927817

For $1000 per month you can get a c8g.12xlarge (assuming you use some kind of savings plan).[0] That's 48 cores, 96 GB of RAM and 22.5+ Gbps networking. Of course you still need to pay for storage, egress etc., but you seem to be exaggerating a bit....they do offer a 44 core Broadwell/128 GB RAM option for $229 per month, so AWS is more like a 4x markup[1]....the C8g would likely be much faster at single threaded tasks though[2][3]

[0]https://instances.vantage.sh/aws/ec2/c8g.12xlarge?region=us-...

[1]https://portal.colocrossing.com/register/order/service/480

[2]https://browser.geekbench.com/v6/cpu/8305329

[3]https://browser.geekbench.com/processors/intel-xeon-e5-2699-...

spwa4 · 2024-11-07T09:36:59 1730972219

Wouldn't a c8g.12xlarge with 500 GB of storage (only EBS is possible) plus 1 Gbps from/to the internet be 5,700 USD per month? That's some discount you have.

If I try to match the actual machine: 16 GB of RAM. A rough estimate is that their Xeon E3-1240 would be ~2 AWS vCPUs. So an r6g.large is the instance that would roughly match this one. Add a 500 GB disk + 1 Gbps to/from the internet and ... the monthly cost is 3,700 USD.

Without any disk and without any data transfer (which would be unusable) it's still ~80USD. Maybe you could create a bootable image that calculates primes.

These are still not the same thing, I get it, but ... it's safe to say you cannot get anything remotely comparable on AWS. You can only get a different thing for way more money.

(made estimates on https://calculator.aws/ )

phonon · 2024-11-07T17:13:26 1730999606

What do you mean by "1gbps from/to the internet"?

125 MB per second × 60 seconds per minute × 60 minutes per hour × 24 hours per day × 30 days = 324 TB?
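The same arithmetic as a quick check, assuming decimal units (1 Gbps = 125 MB/s, 1 TB = 1,000,000 MB) and a 30-day month. The function name is just for illustration.

```python
def monthly_transfer_tb(gbps: float, days: int = 30) -> float:
    """Data moved by a link saturated at `gbps` for `days` days, in TB."""
    mb_per_second = gbps * 1000 / 8   # 1 Gbps -> 125 MB/s
    seconds = 60 * 60 * 24 * days     # seconds in the billing period
    return mb_per_second * seconds / 1_000_000


print(monthly_transfer_tb(1))  # 324.0
```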

If you want 1 Gbps unmetered colo pricing, AWS is not competitive. So set up your video streaming service elsewhere :-)

https://portal.colocrossing.com/register/order/service/480 offers unmetered for $2,500 additional per month, for the record.

If you have high bandwidth needs on AWS you can use AWS Lightsail, which has some discounted transfer rates.

spwa4 · 2024-11-07T21:50:03 1731016203

Even just the compute, without even disk, is barely competitive.

phonon · 2024-11-08T21:38:54 1731101934

I'm not sure I understand your point anymore.

petcat · 2024-11-06T22:12:35 1730931155

> That's 48 cores

That's not 48 dedicated cores, it's 48 "vCPUs". There are probably 1,000 other EC2 instances running on those cores stealing all the CPU cycles. You might get 4 cores of actual compute throughput. Which is what I was saying.

phonon · 2024-11-06T22:24:14 1730931854

That's not how it works, sorry. (Unless you use burstable instances, like T4g) You can run them at 100% as long as you like, and it has the same performance (minus a small virtualization overhead).

petcat · 2024-11-06T22:48:32 1730933312

Are you telling me that my virtualized EC2 server is the only thing running on the physical hardware/CPU? There are no other virtualized EC2 servers sharing time on that hardware/CPU?

phonon · 2024-11-06T23:16:28 1730934988

If you are talking about regular EC2 (not T series, or Lambda, or Fargate etc.) you get the same performance (within say 5%) of the underlying hardware. If you're using a core, it's not shared with another user. The pricing validates this...the "metal" version of a server on AWS is the same price as the full regular EC2 version.

In fact, you can even get a small discount with the -flex series, if you're willing to compromise slightly. (Small discount for 100% of performance 95% of the time).

petcat · 2024-11-07T00:00:37 1730937637

This seems pretty wild to me. Are you saying that I can submit instructions to the CPU and they will not be interleaved and the registers will not be swapped-out with instructions from other EC2 virtual server applications running on the same physical machine?

doctorpangloss · 2024-11-07T02:49:48 1730947788

Only the t instances and other VM types that have burst billing are overbooked in the sense that you are describing.

nostrebored · 2024-11-07T02:41:26 1730947286

Yes, you can validate this by benchmarking things like the L1 cache.

phonon · 2024-11-07T02:03:39 1730945019

Welcome to the wonderful world of multi-core CPUs...

phonon · 2024-11-06T20:28:59 1730924939

LPCAMM2 is available.

angoragoats · 2024-11-06T23:16:24 1730934984

LPCAMM2 is also limited to 2 memory channels, so that won’t work either.

There’s a reason I said “proprietary.”

phonon · 2024-11-06T23:21:35 1730935295

A single LPCAMM2 is a 128 bit total memory channel. The M4 uses a 128 bit memory channel.

You can have 2 LPCAMM2 slots for twice as many memory channels, like the M4 Pro.

angoragoats · 2024-11-06T23:59:15 1730937555

Yes, I am aware. M4 Max would need 4 LPCAMM2 modules, and a hypothetical M4 Ultra would need 8. This sounds unrealistic (especially 4 modules in a laptop), which is why I mentioned a proprietary connector instead. There is precedent for this type of thing in the Mac Studio, where the SSD NAND chips are on proprietary removable modules.
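The module counts follow from dividing each chip's memory bus width by the 128 bits a single LPCAMM2 provides. In this sketch the M4 and M4 Pro widths come from the comments above, while the 512-bit and 1024-bit figures are inferred from the 4 and 8 module counts stated here, so treat them as assumptions.

```python
LPCAMM2_BUS_BITS = 128  # one module, per the thread

# Bus widths in bits; Max and Ultra values are inferred, Ultra is hypothetical.
MEMORY_BUS_BITS = {
    "M4": 128,
    "M4 Pro": 256,
    "M4 Max": 512,
    "M4 Ultra (hypothetical)": 1024,
}


def modules_needed(bus_bits: int, module_bits: int = LPCAMM2_BUS_BITS) -> int:
    """Number of modules required to cover the bus width (ceiling division)."""
    return -(-bus_bits // module_bits)


for chip, bits in MEMORY_BUS_BITS.items():
    print(f"{chip}: {modules_needed(bits)} module(s)")
```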

phonon · 2024-11-06T16:42:14 1730911334

A CAMM2 memory based design would likely work just as well.

phonon · 2024-11-06T02:38:52 1730860732

An M4 Macbook Pro 14 with 32 GB of RAM and 1 TB storage is $2,199... a Lunar Lake with the same specs is $1199. [0]

[0] https://www.bestbuy.com/site/asus-vivobook-s-14-14-oled-lapt...

saagarjha · 2024-11-07T08:40:47 1730968847

Those are not nearly comparable specs.

bigfatkitten · 2024-11-06T19:36:36 1730921796

With a build quality planets apart.

phonon · 2024-11-06T20:35:37 1730925337

My point is it's not "just pay a bit more".

stackghost · 2024-11-06T06:29:27 1730874567

Yeah because it's an ASUS product. They make garbage.