Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collections.ChainMap has wrong type hint for self.maps, rendering it "immutable" #6042

Closed
ramalho opened this issue Sep 17, 2021 · 12 comments
Closed

Comments

@ramalho
Copy link
Contributor

ramalho commented Sep 17, 2021

Typeshed annotates collections.ChainMap as:

class ChainMap(MutableMapping[_KT, _VT], Generic[_KT, _VT]):
    maps: list[Mapping[_KT, _VT]]
    def __init__(self, *maps: Mapping[_KT, _VT]) -> None: ...
    def new_child(self: Self, m: Mapping[_KT, _VT] | None = ...) -> Self: ...

The self.maps attribute holds the chained mappings. It should be annotated as list[MutableMapping[_KT, _VT]].

The *maps parameter of __init__ and the m parameter of new_child need to be MutableMapping[_KT, _VT] as well.

As it is, the implementation in the standard library violates the type hint for self.maps in its __setitem__ method:

    def __setitem__(self, key, value):
        self.maps[0][key] = value
@ramalho ramalho changed the title collections.Chainmap has wrong type hint for self.maps, rendering it "immutable" collections.ChainMap has wrong type hint for self.maps, rendering it "immutable" Sep 17, 2021
ramalho added a commit to ramalho/typeshed that referenced this issue Sep 17, 2021
@JelleZijlstra
Copy link
Member

But what if you make a ChainMap over immutable Mappings? I'm not sure there is a good solution here.

@ramalho
Copy link
Contributor Author

ramalho commented Sep 18, 2021

That's a good point, @JelleZijlstra. However, I still believe we should change the type hints to MutableMapping as I suggest in PR #6044 for two reasons:

  1. The current Mapping hints make code in the standard library itself fail type checks (maybe there is a flaw in how we check type hints against the standard library, I'd like to learn more about that). One example is the ChainMap.__setitem__ method as I mentioned before: it requires at least self.maps[0] to be mutable. But ChainMap also provides new_child, and if that method only accepts Mapping then as soon as it is called the entire ChainMap instance becomes "immutable" as a result, contradicting the declaration of the class itself: ChainMap(MutableMapping[_KT, _VT], Generic[_KT, _VT])

  2. My understanding is that the typeshed policy is to err on the side of false negatives (please correct me if I misundertood). The current Mapping type hints cause false positives. Here is an example in a ChainMap subclass I wrote to implement nested variable scopes with non-local assignment: Environment. The # type: ignore in line 69 is to bypass the issue we are discussing.

@ramalho
Copy link
Contributor Author

ramalho commented Sep 18, 2021

To respond directly to your question: "But what if you make a ChainMap over immutable Mappings?"

I think the answer is: "ChainMap is explicitly declared MutableMapping; if you put immutable mappings into it then you are misusing it and you'd better not call __setitem__ 😉"

Pragmatically, immutable mappings are rare in real Python code, AFAIK. There are no full-fledged implementations in the standard library, only proxies that behave as such.

@JelleZijlstra
Copy link
Member

We do err on the side of false negatives, but in this case that can go either way, because you could get new false positives if you make a ChainMap over an immutable Mapping. The mypy-primer output on your PR shows issues either way: the change makes some casts and type ignores in pandas unnecessary, but introduces new errors in code in chess and edgedb that uses immutable ChainMaps. Whatever decision we make will cause some false positives.

You're right though that ChainMap itself is explicitly declared as a MutableMapping and has some methods that don't make sense if the inner mappings are immutable, so I'm inclined to accept your change, though I'm curious if other maintainers have opinions.

The ideal from a typing perspective would be a separate concepts of mutable and immutable ChainMaps, but I can't think of a sensible way to do that in the stubs.

@srittau
Copy link
Collaborator

srittau commented Sep 20, 2021

Can't we make ChainMap generic over the mapping?

@Akuli
Copy link
Collaborator

Akuli commented Sep 20, 2021

How would we access key and value types in e.g. __setitem__, if the mapping type is a TypeVar but keys and values aren't? By just inheriting from MutableMapping, and not defining some methods in ChainMap even though they are actually defined there?

Also, how would inheritance work? Would it inherit from Mapping or MutableMapping?

@hannes-ucsc
Copy link

Can't we make ChainMap generic over the mapping?

I think that would require python/typing#548 because the mapping type is already generic

Can we offer an ImmutableChainMap that is a Mapping and uses Sequence[Mapping] for the maps attribute?

@kaste
Copy link
Contributor

kaste commented Jul 17, 2022

Ahem, this gives me new mypy complaints on my code base. ChainMap only ever writes to the first mapping in maps. That's my understanding. For example ChainMap({}, defaults) is a typical use-case where the user gets a dict-like and cannot mutate the defaults. That's the typical stacking property of the ChainMap, you only ever mutate the top, most recent mapping.

From my understanding:

def __init__(self):
def __init__(self, __map: MutableMapping):
def __init__(self, __map: MutableMapping, *maps: ImmutableMapping):

I don't think we can have correct hints for the maps member, and new_child of course expects a mutable mapping because iirc it is exactly the next top most thing on the stack.

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Jul 17, 2022

That sounds reasonable to me. I would be in favour of:

class ChainMap(MutableMapping[_KT, _VT], Generic[_KT, _VT]):
     maps: list[Mapping[_KT, _VT]]
     def __init__(self) -> None: ...
     def __init__(self, __map: MutableMapping[_KT, _VT], *maps: Mapping[_KT, _VT]) -> None: ...

Also just to call something out:

My understanding is that the typeshed policy is to err on the side of false negatives ... Here is an example in a ChainMap subclass ...

While I'm not sure we've stated this out loud, historically, typeshed has very consistently preferred false positives for subclassers of classes over false positives for other users of classes.

@kaste
Copy link
Contributor

kaste commented Jul 18, 2022

I'm notoriously bad with these overloads but at least I got the double underscores right. What are pro/cons for the maps member to have it list[mapping] vs list[mutable]. Did you have something special in you mind.

  • You may ask which "layer" actually has a specific key.
  • You massage maps popping/appending left/right to make a new ChainMap out of it.

I personally only work with copies of maps. I never mutate them. Typically I would do new_map = ChainMap({}, old.maps[1:]) (etc.)

@hauntsaninja
Copy link
Collaborator

IIUC You can do both of those things with list[Mapping]. You don't want list[MutableMapping] for the same reason that you'd want to have immutable maps in ChainMap in the first place: ideally type checkers should complain about code like collections.ChainMap({}, types.MappingProxyType({})).maps[1]['key'] = 'value'

@kaste
Copy link
Contributor

kaste commented Jul 28, 2022

Opened #8430 to get more feedback and because closed issues cannot move anymore

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants