Knowing Santa Claus is Fake Doesn't Ruin Christmas - Please Visit http://glyph.twistedmatrix.com/ - This Blog Is Closed. — LiveJournal
Knowing Santa Claus is Fake Doesn't Ruin Christmas
There's no such thing as magic. So when someone tells you that you can magically transform blocking code into Deferreds, as in this Python Cookbook posting, "From blocking functions to Deferred functions", you should be suspicious.

As Itamar suggested, this particular goal can be accomplished with Twisted's standard twisted.internet.threads.deferToThread, which lacks the horrible, possibly crashing bugs present in the recipe presented above.

But, I'm not really here to talk about the recipe, or to impugn its author, Michele Simionato, who has written several other excellent recipes on ASPN; I have even personally used the DOT-grapher for inheritance hierarchies. I doubt Michele spent much time on this quick hack, or considered it a statement in the holy war I'm about to bring up, so please don't interpret what follows as a personal attack.

What concerns me is that there is a persistent meme around the periphery of the Twisted community that asynchronous programming is too hard, and that things would be easier if it looked like it were multi-threaded. This recently came up in a mailing list post I wrote as well.

My personal opinion on this, and I believe this is a matter of public record, is as follows: CONCURRENCY IS HARD. If you are going to write concurrent programs you need to think about it all the time; you need to plan for race conditions, draw your state-transition diagrams, and have big explicit comments in any section of the code that has critical-section requirements, even if you don't have to "lock" it, as in an event-driven system. No invention has really significantly eased the cognitive difficulty of writing scalable concurrent applications, and it is unlikely that any will in the near term. Systems like Twisted and Erlang have both provided powerful tools, but only if you are smart and willing to invest energy in learning to use them properly; they don't make the basic problems any easier. Most of all, threads do not help; in fact, they make the problem worse in many cases. To plagiarize a famous Lisp fellow, if you have a concurrency problem, and you decide to use threads, now you have two problems.

Let's put that aside for the moment, though.

Whether you agree with me about threads or not, Twisted was written by, and is maintained by, a large group of people who feel basically the same way about this. We have some subtle differences about it, but the consensus is the same. Threads are bad. Only use them when you have to, and understand clearly what that means; don't loudly provide "conveniences" for threads, or use those "conveniences" for code which could otherwise be written as non-blocking.

Please, Twisted users, please stop trying to turn Twisted into something it isn't. If you want to use threads, write a multi-threaded program and please stop trying to write infrastructure for Twisted to turn it into a big multi-threaded application platform. WSGI and Zope efforts are excluded from this comment, by the way: that's not trying to help people to write threaded Twisted code, that's about trying to help Twisted be the container for code written using a totally different paradigm, on a different framework, and not written to directly use the Twisted libraries.

Programs written with these kinds of thread-happy conveniences are generally the ones which end up the buggiest, the hardest to test, and most likely the least efficient as well. Worst of all, when you do run into those problems, if you ask the Twisted dev team, you are likely to get a lot of smug "I told you so", and very little actual help, since we have seen the problem before and we keep trying to tell folks not to get started down this path. Personally, it's frustrating to have that advice disregarded again and again and still to get help requests from people who ignore it.

Imagine a man walks into a doctor's office, and says, "Doctor doctor, it hurts when I do this OW", promptly shooting himself in the hand with a nailgun. If this is the third time this week the doctor has removed such a nail, do you think the doctor is going to show this patient much patience? Now, imagine the doctor isn't getting paid for his services. The fellow would be lucky to walk out without a second nail...

You may have some awesome ideas about how multi-threaded programs should work. Good for you. I love reading the work of people who think differently than I do and succeed. It's a good way to learn. However, if you ask me, or if you use Twisted, you are going to run into a lot of advice to discard those ideas, a lot of roadblocks related to pervasive multi-threading, and general "impedance mismatch" problems with the differences between the way you think and the way Twisted works. It would probably be less work for you to start from scratch, or to use a system that has threads as a fundamental part of its programming model.

So, please, I'm not offended if you don't like Twisted, but if you like it, appreciate it for what it is, and if you don't, don't bother with it at all. Trying to use it while sweeping the most central parts of it under the rug isn't going to help anyone, least of all you.

Current Mood: pessimistic
Current Music: No Man's Land (by Billy Joel on "River of Dreams")

11 comments or Leave a comment
(Deleted comment)
From: ghd_mk4 Date: June 26th, 2010 02:37 am (UTC) (Link)

I know someone like you!
(Deleted comment)
From: glyf Date: August 19th, 2005 03:43 pm (UTC) (Link)
Well, in the spirit that this comment was written: FLAME ON

There are hundreds of pages of Twisted documentation, you lazy ass. Read them. There's even going to be a book from O'Reilly coming out soon. Buy it when it comes out.

You know why we haven't finished anything? Because we're working, for free, on a wide variety of applications and infrastructure and nobody, not any of the companies that use it as a cornerstone of their business, not the people who market it as a skill to get good jobs, nobody is paying the development team to work on Twisted or to document it. There are now, at least, a few companies who hire Twisted developers but in no case is it their primary responsibility.

Any cursory google search you do for Twisted and threads will turn up dozens of articles like this one, mostly posted to the twisted-python mailing list. I think this point has been documented to death. Talking about it in a calm, even tone isn't getting the message out there, but a bit of an explosion seems to have gotten attention. I hope that everyone remembers, this time.

As far as the "ach" docstring, I trust that as someone who takes the problem with documentation very seriously, you have already sent a patch to the bugtracker with a clear and explanatory docstring, or have written a check to the appropriate developer. I will be eagerly awaiting your improvements.

So, ahem, flame off, let's leave it all on the field, eh? :)

To your last comment: we have threading modules because threads are a very, very necessary evil. Many libraries are badly written, suggest multithreading, and provide no non-blocking facilities. While cautioning against using them is useful, totally forbidding them by breaking them is just childish. Some others (like PIL and numarray) are extremely computationally intensive and don't ever share state. Still others (the aforementioned Zope and WSGI examples) require infrastructure-level threading, exposing it to the application level as effectively separate processes, with elaborate conflict-resolution and transaction systems, or a full RDBMS as the storage mechanism.

Finally, Twisted devs don't like threads because we know a great deal about them, in most cases. When necessary pretty much everyone on the Twisted dev team can use them to good effect. (Personally I think that almost every Twisted program I've written has needed to spawn one or two threads in its process lifetime - as compared to thousands of deferreds per minute.) We just know how rarely they're actually necessary.

That's why Ousterhout's presentation is "why threads are a bad idea (for most purposes)". That caveat is important. I also think that in general reading and writing to fixed locations in memory is a bad idea for your application too, but kernels need to do a whole heck of a lot of that and I'm glad they do.
From: metamoof Date: August 21st, 2005 03:21 pm (UTC) (Link)
"There are hundreds of pages of Twisted documentation, you lazy ass. Read them."

Yes, there are. Except there aren't. Or it's poorly organised. Or it's basically hard to get the hang of.

I've slowly been getting to grips with twisted, trying to work it out in my head. I'm releasing a program internally in my company next which uses twisted to do something that does end up using threads, mostly to avoid having to convert swathes of legacy code. Though I see these threads as one event in my pipeline, which either succeeds or fails, and I deal with it appropriately in a twisted manner. I may get round, eventually, to converting it all to asynchronous code, but it is going to take a lot longer than I have to get the first release out to do that.

As it is, twisted has taken a *lot* of the pain of having to deal with threads out of my task, for which I am grateful.

But it is still difficult for a newbie to get to grips with twisted, and the documentation is not at all clear sometimes. I'm now having to introduce some of my team to twisted so they can understand what they're doing. Maybe seeing the sorts of things I'm having to go through will help you see some of the problems people refer to when they say "It's not documented".

I set one of them the task of writing a pop-before-smtp component, basically a mail spool that logs into a pop server if a timeout has been arrived at, before attempting to send mail via SMTP. His learning curve was huge, and he was left reeling after three days of attacking the problem, even after I'd broken it down for him into simple, small steps to try and work out. He's now got something working, without much in the way of a test framework or whatever, but he reckons he at least understands what he's doing. And it's roughly 30 lines of code. And it's still not perfect, but it at least appears to work the way he wants it to, for now.

My first problem with the whole thing was trying to work out what the hell the reactor could and couldn't do, and how it worked. A quick reference to the reactor would be great, for example. So I look in the API, hoping to find this infamous "reactor" object I import from twisted.internet, only to find it isn't there. Looking at the source, I eventually discover it's a selectreactor. A huge documentation bug you have is that aliases aren't covered; I found a similar problem trying to work out what the hell a twisted.mail.POP3.AdvancedPOP3Client was.

Then I couldn't work out how to use the POP3 client. I resorted to asking on #twisted, and after a few goes at the thing (including a number of deprecating remarks as to what I was trying to do; I'm not able to change my mail server's policy on how to authenticate connections, I just want to get things done) I eventually work out that I'm probably looking to make a connection with the reactor and then attach the protocol to it. We faff around for a while trying to work out where the hell our code should go in a factory, and how to use it, before we discover ClientConnector. We basically floundered due to lack of information on how to create a basic client with twisted.

So, you say I should put my money where my mouth is, I agree. So I've decided to do so. I've filed a few documentation bugs, and I'm going to see if I can try to convert this POP-before-SMTP example into a fully-fledged tutorial.

As it is, twisted is difficult to get your head round. I'm going to have to rewrite twistd to get it to work as a windows service, so I'll end up having to learn a heck of a lot more about twisted internals that way. I'm aware it's been done already, but only as a deployment solution with py2exe, and I'm not looking to do that. I'll end up copying liberally from the given example, though.
From: puzzlement Date: August 22nd, 2005 09:50 am (UTC) (Link)
Please file documentation bugs. Nice as it would be if I polled LJ for Twisted documentation bugs, I expect filing bugs is even more effective. Only one way to find out...
From: puzzlement Date: August 22nd, 2005 09:53 am (UTC) (Link)
"So, you say I should put my money where my mouth is, I agree. So I've decided to do so. I've filed a few documentation bugs, and I'm going to see if I can try to convert this POP-before-SMTP example into a fully-fledged tutorial."

Sorry, I suck. Which bugs? It is going to be the case for the foreseeable future that I work harder on patches than on wishlist bugs so the POP-before-SMTP thing will be interesting.
From: puzzlement Date: August 22nd, 2005 09:55 am (UTC) (Link)
Actually, speaking of things that suck, let's mention the twisted-bugs mailing list, which has been broken since mid-May and which therefore has failed to inform me of new bugs.

Assuming you haven't already done so (I don't know your bug tracker login), can you get your bugs, make sure they have the documentation keyword, and assign them to 'hypatia'? There's a good chance they'll be lost for a while otherwise.
From: metamoof Date: August 22nd, 2005 11:04 am (UTC) (Link)
1143 through 1146

Most of them are assigned to hypatia. But I wasn't sure what keywords to put in as firefox was killing all the popup windows a bit too zealously. I've added the keywords now.
From: puzzlement Date: August 22nd, 2005 11:52 am (UTC) (Link)
Many thanks! The assignment is probably more important than the keywords anyway.
From: glyf Date: August 22nd, 2005 12:11 pm (UTC) (Link)
Woohoo! Glad to see that this baseless flamewar I started is getting some good results as far as bug reports and concrete suggestions go.

Maybe this could be a slogan: "Twisted: Where LJ Drama Meets Enterprise Computing"
From: puzzlement Date: August 22nd, 2005 09:11 pm (UTC) (Link)
We actually do get about two docs bugs for every extended rant or thread thereof about how bad the documentation is. I have no idea how to increase that number though.
From: oubiwann Date: September 14th, 2006 07:24 pm (UTC) (Link)

The Problem with Threads

I did not see the linked article until today, and I immediately thought of this blog post.


A quote from the closing summary:

"If we expect concurrent programming to become mainstream, and if we demand reliability and predictability from programs, we must discard threads as a programming model."

His suggested solution to the problem is to actually develop/use coordination languages.