Purego – A library for calling C functions from Go without Cgo (github.com/ebitengine)
268 points by weitzj on Feb 12, 2023 | 69 comments


Very cool! I will definitely give this a try, I've been looking to build Go bindings to Mach[0] soon.

It looks like this would make cross-compiling CGO easier (no target C toolchain needed?)

Does this do anything to reduce the overhead of CGO calls / the stack size problem? IIRC the reason CGO overhead exists is at least partly because goroutines only have an ~8k stack to start with, and the C code doesn't know how to expand it, so CGO calls "must" first have the goroutine switched to an OS thread which has an ~8MB stack.

One reason I think Go <-> Zig could be a fantastic pairing is that Zig plans to add a builtin which tells you the maximum stack size of a given function[1], so you could grow the goroutine stack to that size and then call Zig (or, since Zig can compile C code, you could also call C with a tiny shim to report the stack required?) and then eliminate the goroutine -> OS thread switching overhead.

[0] https://github.com/hexops/mach

[1] https://github.com/ziglang/zig/issues/157


Contributor here: Purego doesn’t do anything to improve the overhead of calling into C. It uses the same mechanisms that Cgo does to switch to the system stack and then call the C code. Purego just avoids the need for a C toolchain when cross-compiling code that calls into C from Go.

I’ve actually been quite interested in Zig. If that built-in was added then it would likely be possible to grow the goroutine stack to the proper size and then call the Zig code. Very interesting stuff!


Makes sense! I also wonder (if you know): last I looked I recall that each CGO call requires switching to the system stack, but I can't recall what happens after. Does it switch back to a regular goroutine stack once the syscall has completed?

I wonder if a more tailored CGO implementation could pin a goroutine to a thread which is guaranteed to have a system stack available, so that each CGO call need not worry about that switching at all. Maybe that'd require runtime changes though?


Stack switching isn't that much of the overhead. "ordinary" cgo overhead is <100ns now, has been for a few years, and is much closer to 30 than 80 on recent processors. Most of the overhead is a set of 4 CAS operations (incidentally this means that AMD has measurably lower cgo overhead because of something with its caching model I don't understand).

If cgo's only overhead was the "ordinary" overhead, most people wouldn't have an issue with it. It's downright zippy, in fact... as long as your syscall/C call takes less than 1us. If you stay under the 1us threshold, go will put the OS thread used for the syscall back where it found it and everything moves on.

The issue is that the OS thread was previously serving N goroutines that other parts of the program may be waiting on to move forward, and the OS thread is in a state where go can't pre-empt it and allow those other goroutines to move forward, and it has no idea how long it will be until it can move forward.

As a result, if a syscall/C call takes longer than 1us, go has no choice at this time but to resume a new thread, context switch all the old work onto that thread, and then suspend the syscall thread when it comes back. If you do this a lot, your performance will crater.

There's also separately a few issues around how go chooses to resume/suspend OS threads (for instance, if an os-locked goroutine does a cooperative park for any reason to wait on another thread to do something, go will suspend the thread it was on, context switch to a different thread, then when the goroutine wakes up, it will realize its mistake, resume the thread it was on and context switch again).

This is all fixable stuff, but all the use cases that google cares about are working fine so it doesn't really get any attention.


Yeah, the default behavior is to switch back. It’s possible to pin a goroutine to a thread with runtime.LockOSThread(). However, I don’t believe it avoids the stack switching. Its purpose is to make sure that thread-local storage works properly. The runtime is pretty smart though, so it might already do the optimization you suggested in some way. I know it has a goroutine (sysmon) specifically for monitoring whether a thread is stuck in an external call, in which case it spawns a new thread to continue work.


Ahh that's right, I see.

Maybe I should play with removing the stack switching from purego so that under condition of a locked OS thread you can avoid that overhead :) I might give that a shot sometime


All threads have a system stack available at all times.


Nice, just-in-time goroutine -> OS thread switching.


I thought I had a piece of dust on my screen, but as I scrolled the dust scrolled: what do these 0xB7 characters do in the identifiers? Are they just "name mangling" to keep them from being exported or something?

https://github.com/ebitengine/purego/blob/v0.2.0-alpha/sys_d...

I noticed another 0xB7 character in a comment, and sure enough it seems to be part of the identifiers: https://github.com/ebitengine/purego/search?q=runtime%C2%B7c...


"In Go object files and binaries, the full name of a symbol is the package path followed by a period and the symbol name: fmt.Printf or math/rand.Int. Because the assembler's parser treats period and slash as punctuation, those strings cannot be used directly as identifier names. Instead, the assembler allows the middle dot character U+00B7 and the division slash U+2215 in identifiers and rewrites them to plain period and slash."[0]

[0] - https://go.dev/doc/asm
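The rewrite described in that quote can be mimicked in a few lines. This is only a sketch of the character substitution, not the assembler's actual code; `linkerName` is a made-up helper, and the sample identifiers are the two names from the quoted documentation.

```go
package main

import (
	"fmt"
	"strings"
)

// asmToLinker maps the two stand-in characters allowed in assembly
// identifiers to the punctuation used in real symbol names:
// middle dot (U+00B7) -> '.', division slash (U+2215) -> '/'.
var asmToLinker = strings.NewReplacer("\u00B7", ".", "\u2215", "/")

// linkerName returns the symbol name an assembly identifier stands for.
func linkerName(asmIdent string) string {
	return asmToLinker.Replace(asmIdent)
}

func main() {
	fmt.Println(linkerName("fmt\u00B7Printf"))        // prints fmt.Printf
	fmt.Println(linkerName("math\u2215rand\u00B7Int")) // prints math/rand.Int
}
```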


This is the most awful thing I’ve encountered all day.


That's creative. I'm not sure if I should call it abuse, but I'm leaning heavily in favor.


It's name mangling yeah. I believe Go uses those characters in place of where it might use dots.


I did not know that this could reasonably be done. For some reason it did not occur to me that you could break the chicken and egg problem by simply linking to libdl dynamically; Go binaries are usually static and I didn't even realize it had a mechanism for dynamically linking like this.

This is pretty cool because you can already do this sort of thing on Windows (using the syscall package, since the Windows Loader is always available from kernel32 anyways) and I use it all the time. Probably the most consequential thing I've done with it is my WebView 2 bindings. But with this, you could probably do the same thing on Linux and Mac with GtkWebkit and ... WebKit, and get a native HTML window without CGo on Windows, Mac, and Linux. Perhaps this has already been done (haven't paid attention) but it would make a pretty nice way to get a UI going in Go. (It's not like I'm a fan of using HTML UIs for native apps, but it works pretty well if you don't overdo it, and using native widgets on a given platform does mess up predictability a bit, but it saves disk space at least.)


Could I suggest adding a "Motivation" section to the README?

It made sense after reading the comment about not needing a C toolchain for cross-compiling CGO but I didn't realize it immediately.

Neat stuff


An explanation of how this works and what makes it novel would be nice. I'm not familiar enough to understand how this is better than Cgo.


It loads the dynamic library at runtime, instead of linking against it, which means it makes cross-compiling with CGO easier as no target C toolchain is needed.


What slimsag wrote is correct. It makes cross-compiling code that needs to call C functions as easy as setting GOOS and GOARCH and just building. This means no need to worry about building a C cross-compiler.

I do want to write an article about how purego works under the hood.


I'll be on the lookout. Where's your blog / twitter? I don't see one linked to in your GitHub profile, either.


I don't have either. I was gonna figure out how to post it after I actually sat down and wrote it lol. I'll probably post it in the golang subreddit and maybe link to it in the README.md since it describes how purego works.


Works. Thanks. Btw, if you haven't considered them, substack.com, hashnode.dev, dev.to are pretty good eng blogging platforms.


If memory serves, dev.to tends to be downranked or outright filtered by a bunch of places.

(I have no idea how or why that came about, I've merely observed people having all sorts of trouble getting posts on there visible in aggregators and etc.)


D does it the easy way:

    extern (C) size_t strlen(const char*);

    ...
    size_t length = strlen(p);

You can call any C code like that. You can even simply import C code!


You can do that fine too with Cgo. This looks like more a binding for the dynamic linker.


Interestingly it’s easy to do this with just standard library on Windows: syscall.NewLazyDLL plus NewProc are enough. (Of course in practice you should probably use golang.org/x/sys instead.) I’ve never thought about why dlopen isn’t offered in syscall on *nix until now.


Awesome!

This would be such a game-changer for server-side rendering Javascript in Go with V8.

I'd love to integrate this into Bud[1].

[1] https://github.com/livebud/bud


This could be very useful for a project I am just starting on.

No documentation, and an example with no content, makes the learning curve a bit steep.

Does anyone have any pointers on how to use this?


It’s pretty simple to use if you are familiar with dlopen and friends.

Just call purego.Dlopen("libname.so", purego.RTLD_GLOBAL)

Take the returned library (make sure to check for errors with purego.Dlerror() first) and call purego.Dlsym(lib, "cfuncName"). If it exists then you can call it with either purego.SyscallN or purego.RegisterFunc


It would be good to have more documentation on usage, though; things like how to deal with struct padding (or packed structs), common OS API types (presumably manual munging of UCS2/UTF16 is needed for Windows), etc; at least to mention that it's unchanged from …/x/sys?

It's easier with dlopen because it's still C and therefore you have the normal headers…


+1 for additional examples or documentation.

Particularly an example that takes a c struct pointer would be awesome.

What happens to const char* return values that are null? I think it is empty string, but either a test case or doc confirming it would be awesome.


The example seems straightforward: Include this package, and your usual os calls start going through their "fakecgo" path.


Quite interesting. That's one more option to consider besides using wasm as an intermediary.

Thinking about this as I'd like to call native android libraries.


How is this any different than a mature tool such as SWIG (https://www.swig.org/)?

I've used SWIG extensively with Python to call C code and import C headers for testing/tooling purposes.


SWIG generates binding code. This is dynamic.

It's the Go equivalent of Python's ctypes, I think?


Had the same association in mind. Interesting that Python had this since more or less forever but Go is only now rediscovering the possibility.


I wonder whether I can use it to create a Go library that I can import in Python?


My read of the library is it only works the other way: you could import a Python library from Go


That sounds a little scary. I have no idea how the Go gc would interact with Python.

There might be a way, too.


CGo converts Go code to a C based .so library which you can call in Python.


What's the garbage-collection story? One gc per .so? Or is there some mechanism to share a gc instance between several .so compiled by go?


GP is not exactly right; CGO is not involved in building a shared library except for exporting C functions which can be called. It doesn't "convert Go code to C code" or anything like that.

`-buildmode=c-shared` is what produces a shared library with a C ABI exposed; https://golang.org/s/execmodes was the design document for it, which explains:

> It follows that all Go code shares a single runtime. All Go code uses the same memory allocator, the same goroutine scheduler, and in general acts as though it were linked into a single Go program. This is true even when multiple shared libraries are involved.


Thanks!


You can already do this with CGo. The difference between CGo and purego is that CGo requires you have a C toolchain installed while purego seems to allow you to call a function that has already been compiled apart from the Go build system. In either case, you can call the Go library from Python, but the tough bits are translating Go objects into Python objects (and vice versa) as well as making sure object lifetimes are correctly managed (I think this is mostly "copy data rather than passing pointers across the language boundary").


I know about CGo and have used it. The post title, "A library for calling C functions from Go without Cgo", made me curious how I would achieve something similar with PureGo, without using CGO.


I’m one of the main contributors. I’ve looked into it bc I wanted to know if I could build iOS apps without Cgo. ATM, it is not possible. The reason is that when you run go build to create a shared object, it runs the go cgo tool. That tool, although written entirely in Go, doesn’t know about purego and so will go ahead and import runtime/cgo, which requires a C toolchain. Now it could be possible to circumvent that by using a custom Go build toolchain, but the goal of purego was to be seamless to use in a project. Just use it and then go build like any other dependency.


You can already do this today with standard toolchain.

https://medium.com/analytics-vidhya/running-go-code-from-pyt...


Why?


Cross-compiling with cgo can be frustrating at times, at least in my own experience. Since Ebitengine is a game engine, and they made this repository, I am presuming it's related.


Using Zig is the easiest way to cross-compile with cgo: https://lucor.dev/post/cross-compile-golang-fyne-project-usi...


I am in love with Zig (as you know), but feel the need to say: please be careful with blanket claims like this. purego is a pretty admirable approach to fixing cgo cross-compilation without Zig (though has its own drawbacks, like no static binary for example.)

Using Zig for Go cross-compilation, although quite great, isn't bulletproof. Finding the right CC/CXX incantation can be fairly tricky, the articles on this are not super up-to-date, and you need a copy of the macOS SDK[0] if you intend to cross-compile for macOS. You may also run into a few scary linker warnings and need to figure out the right Go build flags.

I definitely think Zig and Go can be best friends, after all they share so many similar qualities and seem quite complementary to each other. It's definitely possible to get Go<->Zig CGO cross compilation working (I did so for Sourcegraph), but Zig needs a little more love before it'll 'just work' as claimed for the Go use case, so best not to claim otherwise until then.

[0] https://github.com/ziglang/zig/issues/1349


Curious - what is frustrating about it? I found the process very easy and with no weird bugs, but I only made a thin layer over a supplied C SDK, so maybe my use of cgo is not representative?


Make sure you have the right C cross compiler for the target platform. Then you might encounter the usual wrong version of libc, etc. So it's not enough to compile with GOOS=linux, for example; you also need to make sure that your C cross-compiler has the right version of libc for the target machine, or statically link the version you want, or deploy inside a container, etc.

Pure Go? You don't need anything besides the Go compiler.

If you can avoid CGO, avoid it.


Ah, I see, thank you for the explanation. The reason I never had any problems of this kind is that I always compile Go in a container anyway, so the environment is controlled completely. Makes sense that using the C toolchain on the host would be painful, yes.


Setting up the cross-compiling toolchain is a pain in itself, to occasional dabblers and/or when your target is off the beaten track. I can trivially target MIPS-LE with `GOARCH=mipsle GOOS=linux`, but as soon as I added CGO into the mix to support SQLite, things went off the rails and I gave up after trying to set up a MIPS cross-compiling toolchain in a container.


When cross compiling pure Go you set an environment variable, build, and you have a binary ready to run, it is trivial.

When cross compiling with cgo you need a C cross compiling toolchain for the target platform installed with the right libc, etc. It is not impossible and that is how C/C++ is always cross compiled, but it is much more hassle than compiling pure Go.


Can this be used/adapted for Rust too?


Why would rust even have this problem? Rust has native `extern "C"` blocks and good FFI.

The issue in Go is that goroutines run on a small stack and C code has no way to know of that or increase the stack size - so Go's C calling facility (cgo) has to go through a thread and a proper stack.

There are some wild assembly hacks to go around it ^^


This project isn't really about calling C functions, Go already supports that after all (CGO). Rust does suffer the same problem, which is that cross-compilation when using C code is a bit of a nightmare.

purego solves this by using dlopen and friends.

My understanding is Rust solves this through libloading (same approach effectively) and more heavy-handed approaches like cross-rs which distribute full C/C++ build toolchains for each target (in Docker images or something?)


I meant calling Rust from Go.


Oh then sure, I don't see why you couldn't use it with something like https://github.com/mediremi/rust-plus-golang


nice, so this is like libffi but written in Go?


Nice job!


C should have never been part of Go to begin with. all this "compatibility" for Google's sake made Go for the worse.


> C should have never been part of Go to begin with

It’s unavoidable on most platforms. Linux is pretty much the only mainstream platform where the syscall interface is considered the stable interface between user and kernel. Other platforms (like macOS, *BSD, and Windows) consider libc (or equivalent in Win32) to be the stable interface.


Win32 in particular is known for guzzling KBs of stack before getting around to making an actual syscall, so you wouldn’t be able to avoid trampolining through an ABI-compatible environment anyway.


This is a must read for you then if you don't understand the role C plays in modern operating systems: https://faultlore.com/blah/c-isnt-a-language/


Unless you want to rebuild the entire world, including the operating system, Go will have to touch the C FFI boundary at some point.


I really really want to learn and use Go, I spent months with it.

In the end, it's a very different paradigm and it is a pain for me to switch from the C-family syntax to Go and back on a daily basis, and I gave up. Using C, JS, Java, C++, Dart, maybe even Rust (only tried it briefly) on the other hand is much more comfortable and natural for me. Go is just such a different style.

Being vastly different does carry some cost, shall I say. Go could be a great language for new programmers, however.


Dependencies need updating from time to time whatever the dev team would prefer to be true. In a mono repo, projects like "upgrade all uses of X to X’" can be done by a central team if needed and safe. The dev teams don’t have to do much.

If everyone has their own repos, and there are too many of them, it becomes difficult for central teams to do more than write migration tools and dashboards and nag people.



