Show HN: Sheet Music in Smart Glasses

194 points by kevinlinxc a day ago

Hi everyone, my name is Kevin Lin, and this is a Show HN for my sheet music smart glasses project. My video was on the front page on Friday: https://news.ycombinator.com/item?id=43876243, but dang said we should do a Show HN as well, so here goes!

I’ve wanted to put sheet music into smart glasses for a long time, but the perfect opportunity to execute came in mid-February, when Mentra (YC W25) tweeted about a smart glasses hackathon they were hosting - winners would get to take home a pair. I went, had a blast making a bunch of music-related apps with my teammate, and we won, so I got to take them home, refine the project, and make a pretty cool video about it (https://www.youtube.com/watch?v=j36u2i7PKKE).

The glasses are Even Realities G1s. They look normal, but they have two microphones, a screen in each lens, and can even be made with a prescription. Every person I’ve met who tried them on was surprised at how good the display is, and the video recordings of them unfortunately don’t do them justice.

The software runs on AugmentOS, which is Mentra’s smart glasses operating system that works on various 3rd-party smart glasses, including the G1s. All I had to do to make an app was write and run a TypeScript file using the AugmentOS SDK. This gives you voice transcription and raw audio as input, and text or bitmaps as output to the screens; everything else is completely abstracted away. Your glasses communicate with an AugmentOS app, and then the app communicates with your TypeScript service.

The only hard part was creating a Python script to turn sheet music (MusicXML format) into small, optimized bitmaps to display on the screens. To start, the existing landscape of music-related Python libraries is pretty poorly documented and I ran into multiple never-before-seen error messages. Downscaling to the small size of the glasses screens also meant that stems and staff lines were disappearing, so I thought to use morphological dilation to emphasize those without making the notes unintelligible. The final pipeline was MusicXML -> music21 library to render chunks of bars to PNG -> dilate with OpenCV -> downscale -> convert to bitmap with Pillow -> optimize bitmaps with ImageMagick. This is far from the best code I’ve ever written, but the LLMs' attempt at this whole task was abysmal and my years of Python experience really got to shine here. The code is on GitHub: https://github.com/kevinlinxc/AugmentedChords.
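For anyone curious why dilating before downscaling helps, here is a minimal, dependency-free sketch of the idea. It is not the code from the repo (which uses OpenCV and Pillow); `dilate`, `downscale`, the 0.5 coverage threshold, and the toy 12x12 "page" are all made up for illustration. Ink pixels are 1, background is 0.

```python
def dilate(img, radius=1):
    """Binary dilation with a (2*radius+1) square structuring element.

    img is a list of rows; 1 means ink, 0 means background.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if img[y][x]:
                # Spread each ink pixel to its neighbours, staying in bounds.
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w:
                            out[ny][nx] = 1
    return out


def downscale(img, factor, threshold=0.5):
    """Box-filter downscale to 1 bit.

    An output pixel is ink when at least `threshold` of its
    factor x factor source block is ink.
    """
    h, w = len(img) // factor, len(img[0]) // factor
    out = []
    for by in range(h):
        row = []
        for bx in range(w):
            total = sum(
                img[by * factor + dy][bx * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            row.append(1 if total / (factor * factor) >= threshold else 0)
        out.append(row)
    return out


# Toy "page": a single one-pixel-thick staff line on row 5.
page = [[1 if y == 5 else 0 for _ in range(12)] for y in range(12)]
print(downscale(page, 4))          # the thin line falls below the threshold
print(downscale(dilate(page), 4))  # the thickened line survives
```

The one-pixel line covers only a quarter of each 4x4 block, so it vanishes; after dilation it covers three quarters and makes it through.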

Putting it together, my TypeScript service serves these bitmaps locally when requested. I put together a UI where I can navigate menus and sheet music with voice commands (e.g. show catalog, next, select, start, exit, pause) and then I connected foot pedals to my laptop. Because of bitmap sending latency (~3s right now, but future glasses will do better), using foot pedals to turn the bars while playing wasn’t viable, so I instead had one of my pedals toggle autoscrolling, and the other two pedals sped up/temporarily paused the scrolling.
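The pedal logic boils down to a tiny state machine. A minimal sketch in Python (the real service is TypeScript; `AutoScroller`, its method names, and the speed-up factor are hypothetical, and the ~3s send latency is ignored here):

```python
from dataclasses import dataclass


@dataclass
class AutoScroller:
    seconds_per_chunk: float  # how long each bitmap stays on screen
    num_chunks: int           # number of bitmaps in the piece
    running: bool = False     # flipped by the "toggle" pedal
    held: bool = False        # True while the "pause" pedal is down
    position: float = 0.0     # fractional chunk index

    def toggle(self):
        """Pedal 1: start or stop autoscrolling."""
        self.running = not self.running

    def speed_up(self, factor=1.1):
        """Pedal 2: shorten the time each chunk is shown."""
        self.seconds_per_chunk /= factor

    def hold(self, down):
        """Pedal 3: pause only while the pedal is pressed."""
        self.held = down

    def tick(self, dt):
        """Advance by dt seconds and return the chunk index to display."""
        if self.running and not self.held:
            self.position = min(
                self.position + dt / self.seconds_per_chunk,
                self.num_chunks - 1,
            )
        return int(self.position)
```

Calling `tick` from a timer loop and sending a new bitmap whenever the returned index changes would be one way to wire this up.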

After lots of adjustments, I was able to play a full song using just the glasses! It took many takes and there was definitely lots of room for improvement. For example:

- Bitmap sending is pretty slow, which is why using the foot pedals to turn bars wasn’t viable.

- The resolution is pretty small, I would love to put more bars in at once so I can flip less frequently.

- Since foot pedals aren’t portable, it would be cool to have a mode where the audio dictates when the sheet music changes. I tried implementing that with FFT but it was often wrong and more effort is needed. Head tilt controls would be cool too, because full manual control is a hard requirement for practicing.

All of these pain points are being targeted by Mentra and other companies competing in the space, and so I’m super excited to see the next generation! Also, feel free to ask me anything!

eitally a day ago

Is there an opportunity to partner with (or sell to) one of the big digital sheet music vendors (like Musescore or Music Notes, etc)? I've never come upon a compelling personal use case for smart glasses, but as a pianist this could be it. I would HAPPILY purchase both glasses and a subscription from one of the big music vendors if this worked seamlessly and I could do things like embed a metronome or link it to my DAW so I could control things like tempo, rewind, even key transposition.

  • fennecfoxy 8 hours ago

    I feel like this would be sold as more of an app for a smart glasses platform than an individual product.

    >I've never come upon a compelling personal use case for smart glasses

    There are tonnes, it's just the technology isn't there yet; glasses are too bulky and heavy, the fov sucks, the resolution sucks, light transmittance sucks.

    But the use cases are incredibly plentiful; stuff like this (music sheets, documentation, web browsing), getting realtime directions with a blue line or directional hints when walking around an unfamiliar place, overlays/information at tourist sites, home automation/controlling devices.

    I remember an old anime or some show set in a world where a digital world is overlaid on the real world, and AIs and devices from the digital layer can be interacted with in a similar way... what was it, hmmm.

  • kevinlinxc a day ago

    This would make the most sense, since MuseScore is notoriously litigious about usage and redistribution of their library/MusicXMLs, so a collaboration would be necessary to get a usable music catalog for smart glasses

  • adrianh 20 hours ago

    Just a quick plug: check out Soundslice. It's interactive sheet music with a ton of learning tools built in, including easy navigation, looping, tempo changing and transposition.

    We've also got a scanning feature that does OCR for sheet music, to get music into our system. Plus there's a full-featured notation editor. A good overview is at https://www.soundslice.com/features/

pedalpete 19 hours ago

This is such a great use case. I stand in front of my monitor with my guitar and have to scroll the sheet music. So that means stop playing. I often wander away from my computer and then come back if I forget how a section goes.

I'm using tabs not notes, but I'm assuming/hoping your solution will adapt quite easily.

I wonder if you could use a microphone to listen for the notes in order to get auto-scrolling. Because you know the general timing, you're not searching through the entire song (likely) but homing in on the exact point that person is at. An unobtrusive metronome might be nice too.

Congrats! One of the best projects I've seen in a long time, and particularly such a good use case for the early stage of this hardware.
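A rough sketch of that windowed search, assuming some pitch detector already turns the audio into MIDI note numbers (that is the hard part and is not shown). `follow` and `window` are made-up names:

```python
def follow(expected, position, heard, window=8):
    """Return the new position in `expected` after hearing note `heard`.

    Only the next `window` notes are searched, so a note that also occurs
    far away in the piece cannot make the display jump there.
    If nothing in the window matches, the position is left unchanged.
    """
    end = min(position + window, len(expected))
    for i in range(position, end):
        if expected[i] == heard:
            return i + 1  # move past the matched note
    return position


# C major fragment as MIDI note numbers.
score = [60, 62, 64, 65, 67, 60, 62]
pos = 0
pos = follow(score, pos, 60)  # matches the first note
pos = follow(score, pos, 64)  # player skipped 62; still matched within the window
pos = follow(score, pos, 72)  # a wrong or misdetected note is ignored
print(pos)
```

The display would flip to the next bars once the position crosses the end of the current chunk.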

  • kevinlinxc 11 hours ago

    Thanks for the kind words! As I briefly mention in my video, my teammate actually had guitar tabs going, with lyrics, chords, and even web scraping/search. I think a bit of refinement and better hardware and we'll get what you're looking for

mdp2021 18 hours ago

Nice project, but I do not get the competitiveness of part of its implementation details:

-- the project uses "Even Realities G1" AR glasses (640x200, 25°FoV, 1bit green), while the "Epson Moverio" AR glasses can have overwhelmingly superior specs (1920x1080, 34°FoV, full RGB) for possibly an even lower price;

-- software wise, it «uses AugmentOS's SDK to communicate with Mentra servers which talk to the mobile app which talks to your ... glasses» - while an Epson Moverio system would just directly use the glasses as a display for an Android device...

Both gaps between the available and the employed make very little sense.

  • alex1115alex 16 hours ago

    You can absolutely find more feature-packed glasses. Epson, XReal, Viture, Vuzix Blade, TCL RayNeo X2/X3, INMO Air 1-3, etc. The issue is all these weigh north of 70g (too heavy for comfortable daily wear), and/or require a wire that connects to an external compute box (not socially acceptable for daily wear). Glasses like the Even Realities, otoh, are lightweight, look normal, and are wireless, so can be worn all-day as you would normal prescription glasses.

  • turtlebits 17 hours ago

    Makes perfect sense- usability. Look at both sets of glasses.

    One looks normal enough to wear all the time.

Aidevah 13 hours ago

Great job! For converting music to readable images, the LaTeX of music typesetting is LilyPond, which has the ability to create legible music at any size by scaling the notational glyphs accordingly[1]. This sounds like what you were trying to achieve with OpenCV.

With that being said, although LilyPond is very intelligent about all sorts of typesetting minutiae, it's probably difficult to wrangle it to run on smart glasses.

[1] https://lilypond.org/doc/v2.24/Documentation/essay/engraving...

  • kevinlinxc 11 hours ago

    I tried using lilypond actually but ran into an error that no one seemingly knew how to solve. Can try digging through my history if you're interested

KyleBrandt a day ago

A full orchestra on stage playing with no music stands sure would make for a nice sight (assuming the glasses looked like regular old glasses, or maybe Blues Brothers shades).

  • kevinlinxc a day ago

    Agreed! These glasses do look very normal - only tell is that at a certain angle you can see the green of the screen, and the part near the ear is a bit bigger (but easy to conceal with hair)

floren a day ago

Three seconds to send a bitmap? And I thought the Brilliant Monocle/Frame was slow! In the video it looks like you don't get more than a bar or two on-screen at a time... wouldn't any reasonably fast piece outpace the rate at which you can get the next bar on the device?

  • kevinlinxc a day ago

    Yeah, it's a big deal for sure, I was bugging Mentra all hackathon to try and lower it, and also reached out to Even for suggestions (which Mentra is implementing). Regardless, I made it work and next gen hardware, firmware and software are all definitely going to be better for bitmaps

    • floren a day ago

      If they're using the same Nordic BLE chips everybody else is, there's just gonna be a cap on how quickly you can move stuff, I think.

      I've found the display capabilities of the current gen smartglasses pretty disappointing. Yes they're less obtrusive, but the resolution is pitiful. I've found the Vufine a lot more useful, if more ridiculous looking.

      • alex1115alex a day ago

        Mentra here.

        The Nordic MCU they use isn't actually the limiting factor, rather it's the glasses' firmware. For bitmaps from third party apps (like AugmentOS), they enforce 194 byte chunk sizes and do not support RLE. Their first-party app does not have these limitations. We're stuck with this problem for the G1, but we're working with hardware partners to make sure future glasses don't have these issues.
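To put rough numbers on that, here is a small sketch. It assumes a 640x200 1-bit frame (the figure quoted elsewhere in this thread) and counts raw pixel data only, without any headers. The `rle_encode` format is a generic (count, value) scheme for illustration, not whatever the firmware's first-party path uses.

```python
def chunk(payload: bytes, size: int = 194):
    """Split a payload into packets of at most `size` bytes."""
    return [payload[i:i + size] for i in range(0, len(payload), size)]


def rle_encode(data: bytes) -> bytes:
    """Byte-level run-length encoding as (count, value) pairs.

    Runs are capped at 255 so the count fits in one byte.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)


# A completely blank 640x200 frame at 1 bit per pixel.
blank = bytes(640 * 200 // 8)
print(len(blank), len(chunk(blank)))       # raw size and packet count
packed = rle_encode(blank)
print(len(packed), len(chunk(packed)))     # same frame after toy RLE
```

A blank frame is 16,000 bytes, so 83 packets uncompressed versus a single packet with this toy RLE. Real sheet music has far more transitions than a blank frame and would compress less well.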

      • kevinlinxc a day ago

        If I were designing around this limit, I would put enough memory to be able to store a nice buffer of bitmaps in either direction and then do sends that don't change what's currently displayed. I feel like that memory probably exists, I just don't have access to the firmware sadly
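A minimal sketch of the scheduling side of that idea, assuming the glasses could hold a few bitmaps besides the one on screen (which the current firmware does not expose). `plan_sends` and `radius` are hypothetical names:

```python
def plan_sends(current, cached, total, radius=2):
    """Return chunk indices to send in the background, nearest first.

    `cached` is the set of chunk indices already stored on the glasses.
    Forward neighbours are preferred over backward ones at equal distance,
    since playing usually moves forward.
    """
    wanted = []
    for distance in range(1, radius + 1):
        for idx in (current + distance, current - distance):
            if 0 <= idx < total and idx not in cached:
                wanted.append(idx)
    return wanted


print(plan_sends(current=5, cached={5, 6}, total=10))
```

With a buffer like this, a page turn only has to switch to a bitmap that is already on the device, and the slow transfer happens while the player is still reading the current bars.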

analog31 20 hours ago

Very cool idea and demo.

The ability to adapt paper music would be useful. In some genres -- I play big-band jazz -- virtually no material is available in printed form, or it's in the composer's preferred format, which is typically PDF.

paul7986 a day ago

Great and cool to see this, as well as some fellow smart glasses enthusiasts on Hacker News.

I've been an avid enthusiast and promoter of Meta Ray Bans since Oct 2023. They are very handy, and I think they make a ton of sense for anyone who wears sunglasses or glasses and uses their phone to take pics or vids (both things you can do with them without needing your phone; you can also ask them for the time). Though I'm not sure even the HN population cares much about them.

Although I love them, I do not think it is true that they are the next computing platform, as the media and I guess Zuckerberg are saying. You cannot take selfies with smart glasses unless they offer a tiny pop-out drone in the glasses to take pics of you lol. Thus, I think they will be complementary to our personal pocket smart devices and/or upcoming pocket AI devices, which will be able to take the best selfies of you ever (your AI friend, seen on the lock screen, directs you to the best light to get the best selfies).

MarcelOlsz 17 hours ago

This would be incredibly useful for sight reading.

theyknowitsxmas 21 hours ago

Instead of 30 pedals, give the conductor a butt-on.