Have a look at the screenshot below. It's straight from Word on Windows; I simply highlighted some words and right-clicked on them. The menu obviously takes up a large chunk of screen space and partially covers the thing you clicked on. It's just that we have become so used to these effects over years of Windows usage that we've stopped wondering about them. Since it is easy to make a context menu go away, we don't seem to mind the covered space.
I think you're wrong that people "don't seem to mind". The Word UI is generally derided (it faces a lot of the same challenges as RTS games), and I would argue that this context menu is much longer than it ought to be.
There are some exceptions, and perhaps selecting text is one of them (I wasn't able to replicate this placement, and the menu I get looks quite different), but in general the context menu should always show up right below and to the right of the cursor.
Here's a verb coin from Netherworld, the game that started this discussion. I think that the covering aspect is comparable.
Perhaps, though it's hard to say since the Windows screenshot is cropped. I will say that I think the Netherworld verb coin is quite intrusive, covering up a large part of the screen and obscuring the thing you actually want to do.
I might be wrong, but it seems to me that context menus have become even more prevalent lately than they used to be. For instance, call up a browser such as Internet Explorer or Edge. The standard way of reaching "Open in another tab", "Save destination" and so on is the context menu. There used to be top-down menus etc., too, but those seem to have fallen out of fashion.
The menu bar is still accessible by pressing Alt. Also, these are not primary actions (I would bet you my parents have no idea about them), and in many cases I would think the more common way to perform them on the desktop is through key combinations (Ctrl-click, Shift-click, Ctrl-S etc.).
Don't take this to mean that I think context menus aren't useful or good for certain purposes. They are very convenient for less common, truly contextual commands. I would still pretty much always recommend having other ways to access the functionality.
I suppose one could force players to LOOK before INTERACTing, whether they want to or not, by interpreting the first click on any interesting thing as a LOOK action and every subsequent click on it as an INTERACT/TALK/PICKUP/COMBINE action. To make this work, you could introduce a property "EXAMINED" for characters, objects and hotspots. The on_mouse_click event would first read this property and then set it to 1. Afterwards, it would trigger the LOOK event if the property had been 0 beforehand, or else the INTERACT event.
Thus you would only need the left button for both LOOK and INTERACT/TALK/PICKUP/COMBINE.
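For concreteness, the scheme quoted above could be sketched roughly like this. It is written in Python purely for illustration, not in any particular engine's scripting language; the `Hotspot` class and the returned action names are hypothetical stand-ins for whatever the engine provides, and only the "EXAMINED" property and the read-then-set order come from the description.

```python
from dataclasses import dataclass


@dataclass
class Hotspot:
    """Anything clickable: a character, object or hotspot (hypothetical type)."""
    name: str
    examined: int = 0  # the "EXAMINED" property: 0 = never looked at, 1 = looked at


def on_mouse_click(target: Hotspot) -> str:
    """Return the action a left click should trigger on this target."""
    was_examined = target.examined  # read the property first...
    target.examined = 1             # ...then set it to 1
    # LOOK if the property was 0 beforehand, otherwise INTERACT
    return "INTERACT" if was_examined else "LOOK"


door = Hotspot("door")
print(on_mouse_click(door))  # LOOK
print(on_mouse_click(door))  # INTERACT
```

Note that once the property is 1 nothing here ever sets it back, so the description can only be triggered once per thing.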
Yeah, I was gonna say: then you've effectively reverted to a one-button interface, like Ali describes.
It does mean you can't do anything with repeated looks (mostly used for jokes, in my experience), and it also means that after the first look you can't repeat the description (unless you add some rule that would eventually make it revert to the "unexamined" state), so you probably shouldn't put can't-miss information in there. In other words: you have to take it into account in the design.
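One possible form of such a revert rule is time-based: store when the thing was last looked at instead of a plain flag, and treat it as unexamined again once enough time has passed. The sketch below is only an assumption about how that could work, again in Python for illustration; `REVERT_AFTER_SECONDS`, `examined_at` and the `now` parameter are made-up names, and the 60-second value is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical value; could just as well come from a configuration setting.
REVERT_AFTER_SECONDS = 60.0


@dataclass
class Hotspot:
    """Anything clickable (hypothetical type)."""
    name: str
    examined_at: Optional[float] = None  # time of the last LOOK; None = unexamined


def on_mouse_click(target: Hotspot, now: float) -> str:
    """Return the action for a left click at time `now` (in seconds)."""
    expired = (
        target.examined_at is not None
        and now - target.examined_at >= REVERT_AFTER_SECONDS
    )
    if target.examined_at is None or expired:
        target.examined_at = now  # remember when the description was shown
        return "LOOK"
    return "INTERACT"


door = Hotspot("door")
print(on_mouse_click(door, now=0.0))   # LOOK
print(on_mouse_click(door, now=10.0))  # INTERACT
print(on_mouse_click(door, now=75.0))  # LOOK
```

Passing `now` in explicitly keeps the rule easy to test; in a game it would be whatever clock the engine exposes. Interacting does not reset the timer in this version, which is a design choice rather than something the rule requires.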
I do sort of like the idea of the timeout, or alternatively a configuration setting. It would be a kind of accelerator option for more experienced players.
I feel like how to deal with one-button/two-button depends to a large extent on your audience. If it's experienced adventure gamers, two-button is fine. If you anticipate some wider interest, tzach's idea of a reminder if you detect they aren't right-clicking is probably good to have. If you are aiming for a broad audience, one-button or this kind of hybrid one/two-button might be necessary.