According to the code comments for `MapCompose`:

> The functions can also return `None` in which case the output of that function is ignored for further processing over the chain.

    def __call__(self, value: Any, loader_context: MutableMapping[str, Any] | None = None) -> Iterable[Any]:
        if loader_context:
            context = ChainMap(loader_context, self.default_loader_context)
        else:
            context = self.default_loader_context

However, if I read the code correctly, `MapCompose` skips the functions when the input is `None` and instead passes `default_loader_context` down the chain. This makes my code conceptually wrong: the functions that handle `None` are meaningless because `MapCompose` never executes them.
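
A minimal sketch of why a function meant to handle `None` may never run. It is a simplified stand-in, not the `itemloaders` source: `to_list`, `map_compose_sketch`, and `none_to_default` are hypothetical names, and the sketch assumes the real processor first turns its input into a list and treats a bare `None` as an empty one. Under that assumption, the `if loader_context:` branch only selects the context, while a `None` input is skipped simply because there is nothing to loop over.

```python
from typing import Any, Callable


def to_list(value: Any) -> list:
    """Assumed normalisation step: None -> [], list/tuple -> list, scalar -> [scalar]."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def map_compose_sketch(*functions: Callable[[Any], Any]) -> Callable[[Any], list]:
    """Hypothetical stand-in for a MapCompose-style processor (not library code)."""

    def process(value: Any) -> list:
        values = to_list(value)
        for func in functions:
            next_values = []
            for v in values:
                # A None result becomes [], so that value drops out of the chain.
                next_values += to_list(func(v))
            values = next_values
        return values

    return process


seen = []  # every value the first function is actually called with


def none_to_default(value: Any) -> Any:
    """Records each call, then tries to replace None with a default."""
    seen.append(value)
    return "default" if value is None else value


process = map_compose_sketch(none_to_default, str.upper)
result_for_none = process(None)   # nothing to iterate over, so no function runs
result_for_text = process("abc")  # each function runs once on the single value
```

In this model `none_to_default` is never called with `None`, so its fallback branch is dead code, which matches the behavior described above.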
@furas reframed the question as one about default values.
According to the changelog, support for default field values was removed in v0.14, that is, before 2012-10-18. However, the introduction of `@dataclass` items in v2.2.0 brought the concept back. The documentation states that `attr.s` items also allow defining the type and default value of each field but, as with `@dataclass`, it does not provide an example. Additionally, the `get()` method has a `default` argument.
The `get()` method is the easy case: it replaces `None` with `"get_method_default"`.

    start_urls = ["https://books.toscrape.com"]

    def parse(self, response):
        title = response.xpath("//h3/a/text()").get()
        # No match for //h3/b here, so get() falls back to the default
        none_get = response.xpath("//h3/b/text()").get(default="get_method_default")

`@dataclass` is questionable in my implementation: it returns `"dataclass_field_default"` only if `none_field` is deliberately left out of the constructor call (commented out below); otherwise it returns `None`.

    from dataclasses import dataclass

    @dataclass
    class NoneItem:
        title: str
        none_get: str
        none_field: str = "dataclass_field_default"

    def parse(self, response):
        title = response.xpath("//h3/a/text()").get()
        none_get = response.xpath("//h3/b/text()").get(default="get_method_default")
        none_field = response.xpath("//h3/b/text()").get()  # None: no match and no default
        item = NoneItem(
            title=title,
            none_get=none_get,
            # none_field=none_field  # passing None here would override the field default
        )
        yield item

An `@attr.s()` item is defined similarly and shows the same behavior.
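
The dataclass behavior can be reproduced without Scrapy. A field default applies only when the argument is omitted; an explicit `None` is an ordinary value and replaces it. The sketch below shows this, plus one possible workaround using `__post_init__`. `NoneSafeItem` is a hypothetical name, `NoneItem` is trimmed to two fields for brevity, and the title string is a placeholder.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NoneItem:
    title: str
    none_field: Optional[str] = "dataclass_field_default"


@dataclass
class NoneSafeItem:
    """Hypothetical variant that treats an explicit None like a missing value."""

    title: str
    none_field: Optional[str] = None

    def __post_init__(self) -> None:
        # Runs after the generated __init__, so it also sees an explicit None.
        if self.none_field is None:
            self.none_field = "dataclass_field_default"


omitted = NoneItem(title="some title")                         # default is used
explicit_none = NoneItem(title="some title", none_field=None)  # None wins over the default
safe = NoneSafeItem(title="some title", none_field=None)       # None is replaced afterwards
```

With a variant like this, the spider could pass the raw `.get()` result straight into the item instead of commenting the argument out. Whether that is preferable to `get(default=...)` depends on where the default should live: in the selector call or in the item definition.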
In summary, for now, `get()` with its `default` argument is a suitable Scrapy method for replacing an occasional `None` with a default value.